High throughput sequencing of multiple transcripts

ABSTRACT

The present disclosure generally relates to sequencing two or more genes expressed in a single cell in a high-throughput manner. More particularly, the present disclosure relates to a method for high-throughput sequencing of pairs of transcripts co-expressed in single cells (e.g., antibody VH and VL coding sequences) to determine pairs of polypeptide chains that comprise immune receptors.

The present application is a continuation of U.S. application Ser. No. 14/407,849, filed Dec. 12, 2014, as a national phase application under 35 U.S.C. § 371 of International Application No. PCT/US2013/046130, filed Jun. 17, 2013, which claims the priority benefit of U.S. provisional application No. 61/660,370, filed Jun. 15, 2012, the entire contents of each of which are incorporated herein by reference.

BACKGROUND OF THE INVENTION

1. Field of the Invention

The present invention relates generally to the field of molecular biology and immunology. More particularly, it concerns methods for high-throughput isolation of cDNAs encoding immune cell receptors and antibodies.

2. Description of Related Art

There is a need to identify the expression of two or more transcripts from individual cells at high throughput. In particular, for numerous biotechnology and medical applications it is important to identify and sequence the gene pairs encoding the two chains comprising adaptive immune receptors from individual cells at a very high throughput in order to accurately determine the complete repertoires of immune receptors expressed in patients or in laboratory animals. Immune receptors expressed by B and T lymphocytes are encoded respectively by the VH and VL antibody genes and by TCR α/β or γ/δ chain genes. Humans have many tens of thousands or millions of distinct B and T lymphocytes classified into different subsets based on the expression of surface markers (CD proteins) and transcription factors (e.g., FoxP3 in the Treg T lymphocyte subset). High-throughput DNA sequencing technologies have been used to determine the repertoires of VH or VL chains or, alternatively, of TCR α and β in lymphocyte subsets of relevance to particular disease states or, more generally, to study the function of the adaptive immune system (Wu et al., 2011). Immunology researchers have an especially great need for high throughput analysis of multiple transcripts at once.

Currently available methods for immune repertoire sequencing involve mRNA isolation from a cell population of interest, e.g., memory B-cells or plasma cells from bone marrow, followed by RT-PCR in bulk to synthesize cDNA for high-throughput DNA sequencing (Reddy et al., 2010; Krause et al., 2011). However, heavy and light antibody chains (or α and β T-cell receptors) are encoded on separate mRNA strands and must be sequenced separately. Thus, these available methods have potential to unveil the entire heavy and light chain immune repertoires individually, but cannot yet resolve heavy and light chain pairings at high throughput. Without multiple-transcript analysis at the single-cell level to collect heavy and light chain pairing data, the full adaptive immune receptor, which includes both chains, cannot be sequenced or reconstructed and expressed for further study.

SUMMARY OF THE INVENTION

In a first embodiment, the present invention provides a method comprising (a) sequestering single cells and an mRNA capture agent into individual compartments; (b) lysing the cells and collecting mRNA transcripts with the mRNA capture agent; (c) isolating the mRNA from the compartments using the mRNA capture agent; (d) performing reverse transcription followed by PCR amplification on the captured mRNA; and (e) sequencing at least two distinct cDNA products amplified from a single cell. In certain aspects, the cells may be B cells (e.g., plasma cells or memory B cells), T cells, NKT cells, or cancer cells.

Thus, in a specific embodiment, the present invention provides a method for obtaining a plurality of paired antigen receptor sequences comprising: (a) isolating single mammalian cells in individual compartments with immobilized oligonucleotides for priming of reverse transcription; (b) lysing the cells and allowing mRNA transcripts to associate with the immobilized oligonucleotides; (c) performing reverse transcription followed by PCR amplification to obtain cDNAs corresponding to the mRNA transcripts from single cells; (d) sequencing the cDNAs; and (e) identifying multiple mRNA transcripts (e.g., paired antigen receptor sequences) for a plurality of single cells based on the sequencing. For example, in some aspects, a method is provided for obtaining a plurality of paired antibody VH and VL sequences comprising (a) isolating single B-cells in individual compartments with immobilized oligonucleotides for priming of reverse transcription; (b) lysing the B-cells and allowing mRNA transcripts to associate with the immobilized oligonucleotides; (c) performing reverse transcription followed by PCR amplification to obtain cDNAs corresponding to the mRNA transcripts from single B-cells; (d) sequencing the cDNAs; and (e) identifying the paired antibody VH and VL sequences for a plurality of single B-cells. In further aspects, a method is provided for obtaining a plurality of paired T-cell receptor sequences comprising (a) isolating single T-cells in individual compartments with immobilized oligonucleotides for priming of reverse transcription; (b) lysing the T-cells and allowing mRNA transcripts to associate with the immobilized oligonucleotides; (c) performing reverse transcription followed by PCR amplification to obtain cDNAs corresponding to the mRNA transcripts from single T-cells; (d) sequencing the cDNAs; and (e) identifying the paired T-cell receptor sequences for a plurality of single T-cells based on the sequencing.

In further aspects, the method comprises obtaining sequences from at least 10,000, 100,000 or 1,000,000 individual cells (e.g., between about 100,000 and 10 million or 100 million individual cells). Thus, in some aspects, a method comprises obtaining at least 5,000, 10,000 or 100,000 individual paired antibody VH and VL sequences (e.g., between about 10,000 and 100,000, 1 million or 10 million individual paired sequences). In certain aspects, obtaining paired sequences, such as VH and VL sequences, may comprise linking cDNAs (e.g., VH and VL cDNAs) by performing overlap extension reverse transcriptase polymerase chain reaction to link cDNAs in single molecules. In an alternative aspect, a method of the embodiments does not comprise the use of overlap extension reverse transcriptase polymerase chain reaction. For example, two (or more) cDNA sequences can be obtained by sequencing of distinct molecules, such as by sequencing distinct separate VH and VL cDNA molecules.

In one aspect, the method may further comprise determining natively paired transcripts using probability analysis. In this aspect, identifying the paired transcripts may comprise comparing raw sequencing read counts. For example, a probability analysis may comprise performing the steps of FIG. 9. In a specific aspect, a method may comprise identifying the paired antibody VH and VL sequences by performing a probability analysis of the sequences. In certain aspects, the probability analysis may be based on the CDR-H3 and/or CDR-L3 sequences. In some cases, identifying the paired antibody VH and VL sequences may comprise comparing raw sequencing read counts. In a further aspect, the probability analysis may comprise performing the steps of FIG. 9.
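By way of illustration only, a comparison of raw read counts keyed on CDR-H3 and CDR-L3 might be sketched as follows. This is not the decision tree of FIG. 9, whose steps are not reproduced here. The function name, the thresholds `min_fraction` and `min_reads`, and the example CDR3 strings are all hypothetical assumptions made for the sketch.

```python
from collections import Counter, defaultdict

def pair_by_read_counts(linked_reads, min_fraction=0.5, min_reads=2):
    """Pick, for each CDR-H3, the CDR-L3 that dominates its linked reads.

    linked_reads: iterable of (cdr_h3, cdr_l3) tuples, one per sequencing read.
    min_fraction and min_reads are hypothetical cutoffs, not values from the text.
    """
    by_heavy = defaultdict(Counter)
    for (h3, l3), n in Counter(linked_reads).items():
        by_heavy[h3][l3] = n

    pairs = {}
    for h3, light_counts in by_heavy.items():
        total = sum(light_counts.values())
        l3, top = light_counts.most_common(1)[0]
        # Keep the pairing only if the top light chain has enough reads
        # and accounts for a large enough share of this heavy chain's reads.
        if top >= min_reads and top / total >= min_fraction:
            pairs[h3] = (l3, top, total)
    return pairs

# Made-up CDR3 strings, used only to exercise the function.
reads = (
    [("ARDYW", "QQYNS")] * 8
    + [("ARDYW", "QQSYT")] * 1
    + [("AKGGF", "QQSYT")] * 5
    + [("ARWSS", "QQAAT")] * 1
)
pairs = pair_by_read_counts(reads)
```

In this sketch a heavy chain seen with only a single read is left unpaired, on the assumption that such low counts cannot distinguish a native pairing from a spurious one.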

Certain aspects of the present embodiments concern mRNA capture agents. For example, the mRNA capture agent can be a solid support, such as a bead, comprising immobilized oligonucleotides or polymer networks such as dextran and agarose. In one aspect, the bead is a silica bead or a magnetic bead. The mRNA capture agent may comprise oligonucleotides which hybridize mRNA. For example, the oligonucleotides may comprise at least one poly(T) and/or primers specific to a transcript of interest. In certain aspects, a bead of the embodiments is smaller than the individual cells that are being isolated (e.g., B cells).

In some aspects, individual compartments of the embodiments may be wells in a gel or microtiter plate. In one aspect, the individual compartments may have a volume of less than 5 nL. In some aspects, the wells may be sealed with a permeable membrane prior to lysis of the cells or prior to performing RT-PCR. In yet a further aspect, the individual compartments may be microvesicles in an emulsion.

In further aspects, sequestering single cells (and an mRNA capture agent) and lysis of the cells (steps (a) and (b)) may be performed concurrently. Thus, in some aspects, a method may comprise isolating single cells and an mRNA capture agent into individual microvesicles in an emulsion and in the presence of a cell lysis solution.

In further aspects, a method of the embodiments may comprise linking cDNA by performing overlap extension reverse transcriptase polymerase chain reaction to link at least 2 transcripts into a single DNA molecule (e.g., in step (e)). In alternative aspects, step (e) may not comprise the use of overlap extension reverse transcriptase polymerase chain reaction. In certain aspects, step (e) may comprise linking cDNA by performing recombination.

In yet further aspects, sequestering the single cells may comprise introducing cells to a device comprising a plurality of microwells so that the majority of cells are captured as single cells (along with an mRNA capture agent, such as a bead). In further aspects, a method may comprise sequencing of two or more transcripts covalently linked to the same bead.

Thus, in some embodiments, a method is provided for obtaining a plurality of paired antibody VH and VL sequences wherein the cells are B-cells. In one aspect, the method is a method for obtaining paired antibody VH and VL sequences for an antibody that binds to an antigen of interest. In certain aspects, the beads may be conjugated to the antigen of interest and the oligonucleotides may only be conjugated to the beads in the presence of an antibody that binds to the antigen of interest. For example, beads may be coated with an antigen of interest and the mRNA capture agent (e.g., oligo-T) may associate with the bead only in the presence of an antibody that binds to the antigen (see e.g., FIG. 10). For instance, the mRNA capture agent may be associated with protein-A or otherwise functionalized to bind to an antibody if present.

Certain aspects of the embodiments may concern obtaining a sample from a subject (e.g., a sample comprising cells for use in the methods of the embodiments). Samples can be directly taken from a subject or can be obtained from a third party. Samples include, but are not limited to, serum, mucosa (e.g., saliva), lymph, urine, stool, and solid tissue samples. Similarly, certain aspects of the embodiments concern biological fluids and antibodies and/or nucleic acids therefrom. For example, the biological fluid can be blood (e.g., serum), cerebrospinal fluid, synovial fluid, maternal breast milk, umbilical cord blood, peritoneal fluid, mucosal secretions, tears, nasal secretions, saliva, milk, or genitourinary secretions. In certain aspects, cells for use according to the embodiments are mammalian cells, such as mouse, rat or monkey cells. In preferred aspects the cells are human cells.

In some aspects, cells for use in the embodiments are B cells, such as B cells from a selected organ, such as bone marrow. For example, the B cells can be mature B cells, such as bone marrow plasma cells, spleen plasma cells, or lymph node plasma cells, or cells from peripheral blood or a lymphoid organ. In certain aspects, B cells are selected or enriched based on differential expression of cell surface markers (e.g., Blimp-1, CD138, CXCR4, or CD45). In some cases, sequences of a selected class of antibodies are obtained, such as IgE, IgM, IgG, or IgA sequences.

In further aspects, a method of the embodiments may comprise immunizing the subject (e.g., prior to obtaining a cell sample). The method may further comprise isolation of a lymphoid tissue. The lymphoid tissue isolation may occur at least or about 1, 2, 3, 4, 5, 6, 7, 8, 9, or 10 days, or any intermediate range, after immunization. The method may further comprise obtaining a population of nucleic acids of lymphoid tissue, preferably without separating B cells from the lymphoid tissue. The lymphoid tissue may be a primary, secondary, or tertiary lymphoid tissue, such as bone marrow, spleen, or lymph nodes. The subject may be any animal, such as a mammal, fish, amphibian, or bird. The mammal may be human, mouse, primate, rabbit, sheep, or pig.

For determining the nucleic acid sequences (e.g., in the B cells or in lymphoid tissues), any nucleic acid sequencing methods known in the art may be used, including high-throughput DNA sequencing. Non-limiting examples of high-throughput sequencing methods comprise sequencing-by-synthesis (e.g., 454 sequencing), sequencing-by-ligation, sequencing-by-hybridization, single molecule DNA sequencing, multiplex polony sequencing, nanopore sequencing, or a combination thereof.

In a further embodiment, the present invention provides a system comprising (a) an aqueous fluid phase exit disposed within an annular flowing oil phase; and (b) an aqueous fluid phase, wherein the aqueous phase fluid comprises a suspension of cells and is dispersed within the flowing oil phase, resulting in emulsified droplets with low size dispersity comprising an aqueous suspension of cells. In one aspect, the aqueous fluid phase exit is a needle. In a further aspect, the aqueous fluid phase exit is a glass tube. In certain aspects, the oil phase flows through a glass tube or polymeric tubing. In certain aspects, the aqueous phase flows through polymeric tubing. In still a further aspect, the concentration of cells, aqueous fluid phase flow rate, and oil phase flow rate allow for the formation of droplets, wherein each droplet contains a single cell. In some aspects, the cells are selected from the group consisting of: B cells, T cells, NKT cells, and cancer cells. In certain aspects, the aqueous fluid phase comprises beads for nucleic acid capture, reverse transcription reagents, polymerase chain reaction reagents, and/or combinations thereof.

In yet a further embodiment, the present invention provides a composition comprising (a) a bead; (b) an oligonucleotide capable of binding mRNA; and (c) two or more primers specific for a transcript of interest.

In still a further embodiment, the present invention provides a composition comprising an emulsion having a plurality of individual microvesicles, said microvesicles comprising a bead with immobilized oligonucleotides for priming of reverse transcription and individual B-cells, which have been disrupted to release mRNA transcripts.

In certain embodiments, the present invention provides a method comprising (a) adding a common sequence to the 5′ region of two or more oligonucleotides that are specific to a set of gene targets; (b) performing nucleic acid amplification of the set of gene targets by priming the common sequence; and (c) including in the nucleic acid amplification oligonucleotides comprising the common sequence immobilized onto a surface such that immobilized oligonucleotides prime nucleic acid amplification, and resulting in surface capture of amplified sequences.

As used herein in the specification, “a” or “an” may mean one or more. As used herein in the claim(s), when used in conjunction with the word “comprising”, the words “a” or “an” may mean one or more than one.

The use of the term “or” in the claims is used to mean “and/or” unless explicitly indicated to refer to alternatives only or the alternatives are mutually exclusive, although the disclosure supports a definition that refers to only alternatives and “and/or.” As used herein “another” may mean at least a second or more.

Throughout this application, the term “about” is used to indicate that a value includes the inherent variation of error for the device, the method being employed to determine the value, or the variation that exists among the study subjects.

Other objects, features and advantages of the present invention will become apparent from the following detailed description. It should be understood, however, that the detailed description and the specific examples, while indicating preferred embodiments of the invention, are given by way of illustration only, since various changes and modifications within the spirit and scope of the invention will become apparent to those skilled in the art from this detailed description.

BRIEF DESCRIPTION OF THE DRAWINGS

The following drawings form part of the present specification and are included to further demonstrate certain aspects of the present invention. The invention may be better understood by reference to one or more of these drawings in combination with the detailed description of specific embodiments presented herein.

FIG. 1 shows cells isolated into individual sealed wells. The small, spherical objects within the wells are beads. This image is taken through the dialysis membrane. Well diameter is approximately 56 μm.

FIG. 2 shows Left: An isolated single cell immediately prior to lysis; Center: The cell in the process of lysing; and Right: The microwell immediately after lysis, using time-lapse microscopy. Well diameter is approximately 56 μm.

FIG. 3 shows linked OE RT-PCR product. Letters indicate approximate locations of constant, variable, joining, and diversity regions, while numbers indicate approximate locations of complementarity-determining regions.

FIGS. 4A-4C show an overview of the linkage (overlap extension) RT-PCR process. 4A) V-region primers with a 5′ complementary heavy/light overlap region anneal to first strand cDNA. 4B) Second strand cDNA is formed by 5′ to 3′ extension; the overlap region is incorporated into all cDNA. 4C) After denaturation, heavy and light chains with first strand sense anneal to generate a complete 850 bp product through 5′ to 3′ extension. The CDR-H3 and CDR-L3 are located near the outside of the final linked construct, which allows CDR3 analysis by 2×250 paired-end Illumina sequencing.

FIG. 5 shows MOPC-21 cells viably encapsulated in droplets formed via flow focusing. The two input streams to the flow focusing device were comprised of equal parts MOPC-21 cells in PBS (100,000 cells/mL, cell stream) and 0.4% trypan blue in PBS (dye stream), and the cell and dye streams were mixed together immediately prior to the point of emulsion droplet formation. MOPC-21 cells were shown to exclude trypan blue, demonstrating viable encapsulation of single cells within the emulsion droplets.

FIG. 6 presents an overview of high-throughput sequencing technology for multiple transcripts applied toward the sequencing of native antibody VH and VL mRNAs from B-cell populations. i) B-cell populations are sorted for desired phenotype (e.g., mBCs, memory B cells; naive BCs, naive B cells). ii) Single cells are isolated by random settling into a microwell array; poly(dT) microbeads are also added to the wells. iii) Wells are sealed with a dialysis membrane and equilibrated with lysis buffer to lyse cells and anneal VH and VL mRNAs to poly(dT) beads (blob represents a lysed cell, circles depict magnetic beads, black lines depict mRNA strands). iv) Beads are recovered and emulsified for cDNA synthesis and linkage PCR to generate an ~850-base pair VH:VL cDNA product. v) Next-generation sequencing is performed to sequence the linked strands. vi) Bioinformatic processing is used to analyze the paired VH:VL repertoire.

FIGS. 7A-7C show amplification of heavy and light chain DNA on oligo-immobilized magnetic beads for high-throughput sequencing. 7A) Beads display a mix of 3 immobilized oligonucleotides: poly(T) for mRNA capture, AHX89 for heavy chain amplification, and BRH06 for light chain amplification. 7B) Reverse transcription is initiated from captured mRNA (represented by gray dashed lines) that has annealed to immobilized poly(T) oligonucleotides. Specially designed immunoglobulin constant region reverse transcription primers have either AHX89 at the 5′ end (for heavy chain) or BRH06 (for light chain). Reverse transcription polymerase chain reaction occurs inside emulsion droplets. 7C) V region forward primers have either an <F3> sequence at the 5′ end (heavy chain) or <F5> sequence (light chain) which will be used to initiate pyrosequencing. cDNA strands are displayed as black lines.

FIG. 8 shows a diagram of the nozzle/carrier stream apparatus. A glass capillary tube supplies an outer oil phase carrier stream (arrows) that surrounds a needle exit. The needle injects aqueous phase containing cells, and monodisperse droplets are generated by shear forces from annular oil phase flow.

FIG. 9 shows a general decision tree algorithm for pairing of VH and VL sequences.

FIGS. 10A-10C show an exemplary process of mRNA capture from isolated single cells encoding high-affinity antibodies for a particular antigen. (10A) Antibody-secreting B cells (top left) are isolated into compartments containing beads with immobilized antigen. Secreted antibody (gray) is captured by the beads if the B cell encodes a high-affinity antibody for the antigen. (10B) Any unbound cell-secreted antibodies are washed away and an anti-IgG antibody (white) with linked poly(dT) ssDNA (black strands) is added to the compartment. The anti-IgG:poly(dT) (or other mRNA capture moiety) construct is immobilized on beads containing captured antibody. poly(dT) ssDNA is co-localized only with cells that secrete high-affinity antibody to the desired antigen. (10C) The compartments are sealed and cells are lysed. mRNA strands (small circles) released from cells which secreted high-affinity antibody are captured via hybridization to the poly(dT) on poly(dT):antibody:bead constructs. Next, beads can be recovered for single-cell mRNA transcript analysis.

DESCRIPTION OF ILLUSTRATIVE EMBODIMENTS

The present disclosure generally relates to sequencing two or more genes expressed in a single cell in a high-throughput manner. More particularly, the present disclosure provides a method for high-throughput sequencing of pairs of transcripts co-expressed in single cells to determine pairs of polypeptide chains that comprise immune receptors (e.g., antibody VH and VL sequences).

The methods of the present disclosure allow for the repertoire of immune receptors and antibodies in an individual organism or population of cells to be determined. Particularly, the methods of the present disclosure may aid in determining pairs of polypeptide chains that make up immune receptors. B cells and T cells each express immune receptors; B cells express immunoglobulins, and T cells express T cell receptors (TCRs). Both types of immune receptors consist of two polypeptide chains. Immunoglobulins consist of variable heavy (VH) and variable light (VL) chains. TCRs are of two types: one consisting of an α and a β chain, and one consisting of a γ and a δ chain. Each of the polypeptides in an immune receptor has a constant region and a variable region. Variable regions result from recombination and end joint rearrangement of gene fragments on the chromosome of a B or T cell. In B cells additional diversification of variable regions occurs by somatic hypermutation. Thus, the immune system has a large repertoire of receptors, and any given receptor pair expressed by a lymphocyte is encoded by a pair of separate, unique transcripts. Only by knowing the sequence of both transcripts in the pair can one study the receptor as a whole. Knowing the sequences of pairs of immune receptor chains expressed in a single cell is also essential to ascertaining the immune repertoire of a given individual or population of cells.

Currently available methods to analyze multiple transcripts in single cells, such as the two transcripts that comprise adaptive immune receptors, are limited by low throughput and very high instrumentation and reagent costs. No technology currently exists for rapidly analyzing how many cells express a set of transcripts of interest or, more specifically, for sequencing native lymphocyte receptor chain pairs at very high throughput (greater than 10,000 cells per run). The present disclosure aims to correct these deficiencies by providing a new technique for sequencing multiple transcripts simultaneously at the single-cell level with a throughput two to three orders of magnitude greater than the current state of the art.

One advantage of the methods of the present disclosure is that they achieve a throughput several orders of magnitude higher than the current state of the art. In addition, the present disclosure allows two transcripts to be linked for large cell populations in a high throughput manner, faster and at a much lower cost than competing technologies.

In certain embodiments, the present disclosure provides methods comprising separating single cells in a compartment with beads conjugated to oligonucleotides; lysing the cells; allowing mRNA transcripts released from the cells to hybridize with the oligonucleotides; performing overlap extension reverse transcriptase polymerase chain reaction to covalently link DNA from at least two transcripts derived from a single cell; and sequencing the linked DNA. In certain embodiments, the cells may be mammalian cells. In certain embodiments, the cells may be B cells, T cells, NKT cells, or cancer cells.

In other embodiments, the present disclosure provides methods comprising separating single cells in a compartment with beads conjugated to oligonucleotides; lysing the cells; allowing mRNA transcripts released from the cells to hybridize with the oligonucleotides conjugated to the beads; performing reverse transcriptase polymerase chain reaction to form at least two cDNAs from at least two transcripts derived from a single cell; and sequencing the cDNA attached to the beads.

In another embodiment, the present disclosure provides a method comprising mixing cells with beads having a diameter smaller than the diameter of the cells, wherein the beads are conjugated to oligonucleotides, sequestering the cells and beads within compartments having a volume of less than 5 nL, lysing the cells and allowing mRNA transcripts to associate with the beads, isolating the beads and associated mRNA from the compartments, performing reverse transcription followed by PCR amplification on the bead-associated mRNA, and sequencing the DNA product from each bead to identify cDNA associated with each bead.

In other embodiments, the present disclosure provides a system comprising an aqueous fluid phase exit disposed within an annular flowing oil phase, wherein the aqueous phase fluid comprises a suspension of cells and is dispersed within the flowing oil phase, resulting in emulsified droplets with low size dispersity comprising an aqueous suspension of cells.

In other embodiments, the present disclosure provides a composition comprising a bead, an oligonucleotide capable of binding mRNA, and two or more primers specific for a transcript of interest.

In certain embodiments, the present disclosure also provides for a device comprising ordered arrays of microwells, each with dimensions designed to accommodate a single lymphocyte cell. In one embodiment, the microwells may be circular wells 56 μm in diameter and 50 μm deep, for a total volume of approximately 125 pL. Such microwells would normally range in volume from 20-3,000 pL, though a wide variety of well sizes, shapes and dimensions may be used for single cell accommodation. In certain embodiments, the microwell may be a nanowell. In certain embodiments, the device may be a chip. The device of the present disclosure allows the direct entrapment of tens of thousands of single cells, with each cell in its own microwell, in a single chip. In certain embodiments, the chip may be the size of a microscope slide. In one embodiment, a microwell chip may be used to capture single cells in their own individual microwells (FIG. 6). The microwell chip can be made from polydimethylsiloxane (PDMS); however, other suitable materials known in the art such as polyacrylamide, silicon and etched glass may also be used to create the microwell chip.
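As a check on the stated well volume, the following minimal sketch computes the volume of a cylindrical well from the 56 μm diameter and 50 μm depth given above. The function name is hypothetical, and the well is assumed to be an ideal right circular cylinder.

```python
import math

def cylinder_volume_pl(diameter_um, depth_um):
    """Volume of an ideal cylindrical well in picoliters."""
    radius_um = diameter_um / 2.0
    volume_um3 = math.pi * radius_um ** 2 * depth_um
    # 1 cubic micrometer = 1 fL = 1e-3 pL
    return volume_um3 * 1e-3

well_pl = cylinder_volume_pl(56, 50)
print(round(well_pl, 1))  # about 123 pL, consistent with the stated 125 pL
```

Under the same assumption, the 5 nL upper bound recited for compartments elsewhere in this disclosure corresponds to 5,000 pL, roughly forty times the volume of such a well.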

Several beads or other particles conjugated with oligonucleotides may also be captured in the microwells with the single cells according to the methods of the present disclosure. In certain embodiments, beads may comprise oligonucleotides immobilized on the surface of the beads. In other embodiments, the beads may be magnetic. In other embodiments, the beads may be coated with one or more oligonucleotides. In certain embodiments, the oligonucleotides may be a poly(T), a sequence specific for heavy chain amplification, and/or a sequence specific for light chain amplification. A dialysis membrane covers the microwells, keeping the cells and beads in the microwells while lysis reagents are dialyzed into the microwells. The lysis reagents cause the release of the cells' mRNA transcripts into the microwell with the beads. In embodiments where the oligonucleotide is poly(T), the poly(A) mRNA tails are captured by the poly(T) oligonucleotides on the beads. Thus, each bead is coated with mRNA molecules from a single cell. The beads are then pooled, washed, and resuspended in solution with reagents for overlap extension (OE) reverse transcriptase polymerase chain reaction (RT-PCR). This reaction mix includes primers designed to create a single PCR product comprising cDNA of two transcripts of interest covalently linked together. Before thermocycling, the reagent solution/bead suspension is emulsified in oil phase to create droplets with no more than one bead per droplet. The linked cDNA products of OE RT-PCR are recovered and used as a template for nested PCR, which amplifies the linked transcripts of interest. The purified products of nested PCR are then sequenced and pairing information is analyzed (FIG. 6). In other embodiments, restriction and ligation may be used to link cDNA of multiple transcripts of interest. In other embodiments, recombination may be used to link cDNA of multiple transcripts of interest.

The present disclosure also provides a method to trap mRNA from single cells on beads, perform cDNA synthesis, link the sequences of two or more desired cDNAs from single cells to create a single molecule, and finally reveal the sequence of the linked transcripts by High Throughput (Next-gen) sequencing. According to the present disclosure, one way to increase throughput in biological assays is to use an emulsion that generates a high number of 3-dimensional parallelized microreactors. Emulsion protocols in molecular biology often yield 10^9-10^11 droplets per mL (sub-pL volume). Emulsion-based methods for single-cell polymerase chain reaction (PCR) have found a wide acceptance, and emulsion PCR is a robust and reliable procedure found in many next-generation sequencing protocols. However, very high throughput RT-PCR in emulsion droplets has not yet been implemented because cell lysates within the droplet inhibit the reverse transcriptase reaction. Cell lysate inhibition of RT-PCR can be mitigated by dilution to a suitable volume.
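To illustrate the droplet volumes implied by the droplet counts in the preceding paragraph (read here as ten to the ninth through ten to the eleventh power droplets per mL), a minimal sketch follows. It assumes, purely for illustration, that 1 mL of aqueous phase is divided into equally sized droplets; the function name is hypothetical.

```python
PL_PER_ML = 1e9  # 1 mL = 1e9 pL

def mean_droplet_volume_pl(droplets_per_ml):
    """Mean droplet volume in pL if 1 mL is split evenly into droplets."""
    return PL_PER_ML / droplets_per_ml

low_count_pl = mean_droplet_volume_pl(1e9)    # fewest droplets, largest volume
high_count_pl = mean_droplet_volume_pl(1e11)  # most droplets, smallest volume
```

Under this assumption the mean droplet volume ranges from about 1 pL down to about 0.01 pL (10 fL), in line with the sub-pL description.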

In another embodiment, cells are lysed in emulsion droplets containing beads for nucleic acid capture. In certain embodiments, the beads may be conjugated with oligonucleotide. In certain embodiments, the oligonucleotide may be poly(T). In other embodiments, the oligonucleotide may be a primer specific to a transcript of interest. In certain embodiments, the bead may be magnetic. An aqueous solution with a suspension of both cells and beads is emulsified into oil phase by injecting an aqueous cell/bead suspension into a fast-moving stream of oil phase. The shear forces generated by the moving oil phase create droplets as the aqueous suspension is injected into the stream, creating an emulsion with a low dispersity of droplet sizes. Each cell is in its own droplet along with several beads conjugated with oligonucleotides. The uniformity of droplet size helps to ensure that individual droplets do not contain more than one cell. Cells are then thermally lysed, and the mixture is cooled to allow the beads to capture mRNA. The emulsion is broken and the beads are collected. The beads are resuspended in a solution for emulsion OE RT-PCR to link the cDNAs of transcripts of interest together. Nested PCR and sequencing of the linked transcripts are performed according to the present disclosure. In certain embodiments, the aqueous suspension of cells comprises reverse transcription reagents. In certain other embodiments, the aqueous suspension of cells comprises at least one of polymerase chain reaction and reverse transcriptase polymerase chain reaction reagents. In other embodiments, restriction and ligation may be used to link cDNA of multiple transcripts of interest. In other embodiments, recombination may be used to link cDNA of multiple transcripts of interest.
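The relationship between cell concentration, droplet volume, and single-cell occupancy can be illustrated with a Poisson loading model. The Poisson model is a common assumption for random encapsulation and is not stated in this disclosure; the function names, the 100,000 cells/mL concentration, and the 1 nL droplet volume are hypothetical values chosen only for the sketch.

```python
import math

def poisson_pmf(k, lam):
    """Probability of exactly k cells in a droplet with mean occupancy lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

def single_cell_fraction(cells_per_ml, droplet_volume_nl):
    """Return (mean cells per droplet, fraction of occupied droplets
    that hold exactly one cell) under Poisson loading."""
    lam = cells_per_ml * droplet_volume_nl * 1e-6  # 1 nL = 1e-6 mL
    occupied = 1.0 - poisson_pmf(0, lam)
    return lam, poisson_pmf(1, lam) / occupied

# Hypothetical operating point: 100,000 cells/mL in 1 nL droplets.
lam, frac = single_cell_fraction(cells_per_ml=1e5, droplet_volume_nl=1.0)
```

At this hypothetical operating point the mean occupancy is 0.1 cells per droplet, and roughly 95% of the droplets that contain any cell contain exactly one. Lowering the cell concentration or the droplet volume raises that fraction, at the cost of more empty droplets.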

In another embodiment, emulsion droplets which contain individual cells and RT-PCR reagents are formed by injection into a fast-moving oil phase. Thermal cycling is then performed on these droplets directly. In certain embodiments, an overlap extension reverse transcription polymerase chain reaction may be used to link cDNA of multiple transcripts of interest.

In another embodiment, cDNAs of interest from a single cell are attached via RT-PCR to beads as described below, and the transcripts on the beads are sequenced directly using high-throughput sequencing. An equal mixture of three species of functionalized oligonucleotide primers may be conjugated to functionalized beads. One of the oligonucleotides may be poly(T) to capture the poly(A) tail of mRNAs. The other two oligonucleotides may be specific primers for amplifying the transcripts of interest. Beads prepared in this way are mixed with cells in an aqueous solution, and the cell/bead suspension is emulsified so that each cell is in its own droplet along with an excess of beads. In certain embodiments an average of 55 beads may be contained in each droplet. Cells are thermally lysed, and poly(T) oligonucleotides on the beads bind mRNAs. The emulsion is broken, and beads are collected, washed, and resuspended in a solution with reagents and primers for RT-PCR that will result in amplification of the transcripts of interest in such a way that the transcripts are attached to the beads. The bead suspension is emulsified and RT-PCR is performed. The beads are collected and submitted for high-throughput sequencing, which directly sequences the two transcripts attached to the beads by initiating multiple sequence reads using at least two different primers, where each initiation primer is specific to a transcript of interest. The two transcripts are paired by bead location in the high-throughput sequencing grid, revealing sequences that are expressed together from a single cell. Sequencing can be performed, for example, on Applied Biosystems' SOLiD platform, Life Technologies' Proton Torrent, or Illumina's HiSeq sequencing platform.

Primer design for OE RT-PCR determines which transcripts of interest expressed by a given cell are linked together. For example, in certain embodiments, primers can be designed that cause the respective cDNAs from the VH and VL chain transcripts to be covalently linked together. Sequencing of the linked cDNAs reveals the VH and VL sequence pairs expressed by single cells. In other embodiments, primer sets can also be designed so that sequences of TCR pairs expressed in individual cells can be ascertained or so that it can be determined whether a population of cells co-expresses any two genes of interest.

Bias can be a significant issue in PCR reactions that use multiple amplification primers because small differences in primer efficiency generate large product disparities due to the exponential nature of PCR. One way to alleviate primer bias is by amplifying multiple genes with the same primer, which is normally not possible with a multiplex primer set. When a common amplification region is included at the 5′ end of multiple unique primers of interest, the common amplification region is added to the 5′ end of all PCR products during the first duplication event. Following the initial duplication event, amplification is achieved by priming only at the common region to reduce primer bias and allow the final PCR product distribution to remain representative of the original template distribution.
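The compounding effect of primer efficiency described above can be illustrated with a minimal Python sketch. It assumes an idealized model in which each cycle multiplies the product by (1 + efficiency); the efficiency values (0.95 and 0.85) and the cycle count (30) are illustrative assumptions and are not taken from the present disclosure.

```python
def fold_amplification(efficiency: float, cycles: int) -> float:
    """Idealized fold amplification: each cycle multiplies product by (1 + efficiency)."""
    return (1.0 + efficiency) ** cycles


def product_ratio(eff_a: float, eff_b: float, cycles: int) -> float:
    """Ratio of final product amounts for two templates that start at equal abundance."""
    return fold_amplification(eff_a, cycles) / fold_amplification(eff_b, cycles)


if __name__ == "__main__":
    # Hypothetical per-cycle efficiencies for two gene-specific primers.
    ratio = product_ratio(0.95, 0.85, 30)
    print(f"Product ratio after 30 cycles: {ratio:.2f}")

    # With a shared (common) primer, both templates amplify with the same
    # efficiency, so the starting ratio is preserved.
    print(f"Product ratio with a common primer: {product_ratio(0.90, 0.90, 30):.2f}")
```

Under these assumed values, a 0.10 difference in per-cycle efficiency skews two initially equal templates by roughly five-fold after 30 cycles, whereas priming both templates from a common region leaves the ratio at 1.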

Such a common region can be exploited in various ways. One clear application is to add the common amplification primer at higher concentration and the unique primers (with 5′ common region) at a low concentration, such that the majority of nucleic acid amplification occurs via the common sequence for reduced amplification bias. Another application is the surface-based capture of amplification products, for example to capture PCR product onto a microbead during emulsion PCR. If the common sequence oligonucleotides are immobilized onto a bead surface, the PCR products of interest will become covalently linked to the bead during amplification. In this way, a widely diverse set of transcripts can be captured onto a surface using a single immobilized oligonucleotide sequence.

For example, two different common regions may be immobilized onto a bead surface at equal concentration (e.g., one common sequence for heavy chain, and a different common sequence for light chain). Following PCR amplification, the bead will be coated with approximately 50% heavy chain amplification product, and 50% light chain amplification product. This balance between heavy and light chain representation on the bead surface helps ensure sufficient signal from both heavy and light chains when the bead is submitted to high throughput sequencing.

Accordingly, in certain embodiments, the present disclosure provides methods comprising adding a common sequence to the 5′ region of two or more oligonucleotides that are specific to a set of gene targets; and performing nucleic acid amplification of the set of gene targets by priming the common sequence. In certain embodiments, the common sequence is immobilized onto a surface. In other embodiments, the common sequence may be used to capture amplification products.

The methods of the present disclosure allow for information regarding multiple transcripts expressed from a single cell to be obtained. In certain embodiments, probabilistic analyses may be used to identify native pairs with read counts or frequencies above non-native pair read counts or frequencies. The information may be used, for example, in studying gene co-expression patterns in different populations of cancer cells. In certain embodiments, therapies may be tailored based on the expression information obtained using the methods of the present disclosure. Other embodiments may focus on discovery of new lymphocyte receptors.

EXAMPLES

The following examples are included to demonstrate preferred embodiments of the invention. It should be appreciated by those of skill in the art that the techniques disclosed in the examples which follow represent techniques discovered by the inventor to function well in the practice of the invention, and thus can be considered to constitute preferred modes for its practice. However, those of skill in the art should, in light of the present disclosure, appreciate that many changes can be made in the specific embodiments which are disclosed and still obtain a like or similar result without departing from the spirit and scope of the invention.

Example 1—Construction of a High Density Microwell Plate

A grid of micropillars (56 μm diameter, 50 μm height) is photolithographically patterned onto a silica wafer using SU-8 photoresist (Fisher Scientific), and the silica wafer is used as a mold to print polydimethylsiloxane (PDMS) chips (Sylgard 184, Dow Corning) with the dimensions of a standard microscope slide and containing approximately 170,000 wells per chip. Dimensions of the micropillars may range from about 5 μm to about 300 μm wide and from about 5 μm to about 300 μm high. Molded PDMS chips are silanized in an oxygen plasma chamber for 5 minutes to generate a hydrophilic surface. The PDMS chips are then blocked in 1% bovine serum albumin (BSA) for 30 minutes and washed in deionized water and phosphate-buffered saline (PBS) to prepare for cell seeding.

Example 2—Method for Linking Two Transcripts from a Single Cell in a High Throughput Manner

The process for physically linking two or more transcripts derived from a single cell in a high throughput manner uses the sealed PDMS microwell device of Example 1 to trap single cells into separate wells. Poly(T) magnetic micron-size beads for mRNA capture are also introduced into the microwells, where cell lysis occurs. Once cells and beads have been loaded, the device is sealed with dialysis membrane and a lysis solution is introduced. Subsequently, beads are recovered, resuspended in solution with reagents, primers and polymerase enzyme for overlap extension (OE) RT-PCR, and the solution is then emulsified so that each bead is encapsulated within a single emulsion droplet. The emulsion is subjected to thermal cycling to physically link the two transcripts (e.g., immunoglobulin heavy and light chain cDNA), and the linked products are recovered from the emulsion following cycling. A nested PCR amplification is performed, and then the resulting DNA is sequenced using Illumina or any other NextGen sequencing technology that can yield reads of appropriate length to unequivocally interpret the transcript pairing information (FIG. 6).

The method outlined above was employed to link the immunoglobulin variable heavy (VH) and variable light (VL) chains in mixtures comprising the mouse hybridoma cell lines MOPC-21 and MOPC-315. The VH and VL sequences expressed by each of these cell lines are known and hence these experiments served for method validation. 5 mL each of MOPC-21 and MOPC-315 cells were separately withdrawn from culture two days after passage (cells were grown in Falcon vented T-25 culture flasks, 10 mL volume in RPMI-1640, 10% FBS, 1% P/S) and placed in 15 mL tubes. A cell density of 150,000 viable cells/mL with >98% viability was determined with a hemocytometer and trypan blue exclusion. RNAse A was added to each tube at a concentration of 30 μg/mL and cells were incubated at 37° C. for 30 minutes. Then, cells were washed three times with complete culture media and twice with PBS (pH 7.4). Washes were accomplished by centrifugation at 250 g at room temperature for 5 minutes followed by aspiration and resuspension. Cell concentrations were counted again with a hemocytometer, and MOPC-21 and MOPC-315 cells were mixed to form a cell suspension with a total concentration of 35,000 cells/mL in PBS, composed of 17,500 MOPC-21 cells/mL and 17,500 MOPC-315 cells/mL.

500 μL of the MOPC-21 and MOPC-315 cell mixture were applied to a PDMS microwell device that had been incubated with BSA to block non-specific adsorption. 17,500 total cells were added to each chip. Four chips were used in parallel (70,000 total cells distributed across four PDMS chips), and cells were allowed to settle into wells by gravity over the course of 5 minutes with gentle agitation. As each PDMS chip contains approximately 120,000 wells and cell loading efficiency is estimated at approximately 70%, approximately 1 in 10 wells contain isolated cells. The incidence of two cells per well can be accurately estimated with Poisson statistics, and under these conditions, >95% of wells containing cells contained a single cell.
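The Poisson estimate mentioned above can be sketched in a few lines of Python. The sketch assumes that the number of cells per well follows a Poisson distribution with mean μ = 0.1 (about one cell per ten wells) and computes the fraction of occupied wells that hold exactly one cell; the function names are illustrative and do not come from the present disclosure.

```python
import math


def poisson_pmf(k: int, mu: float) -> float:
    """Probability that a well contains exactly k cells when the mean is mu cells/well."""
    return mu ** k * math.exp(-mu) / math.factorial(k)


def single_cell_fraction(mu: float) -> float:
    """Fraction of occupied wells (k >= 1) that contain exactly one cell."""
    occupied = 1.0 - poisson_pmf(0, mu)
    return poisson_pmf(1, mu) / occupied


if __name__ == "__main__":
    mu = 1 / 10  # assumed mean occupancy: about 1 cell per 10 wells
    print(f"Empty wells:             {poisson_pmf(0, mu):.1%}")
    print(f"Wells with one cell:     {poisson_pmf(1, mu):.1%}")
    print(f"Single-cell fraction of occupied wells: {single_cell_fraction(mu):.1%}")
```

At μ = 0.1 the single-cell fraction comes out just above 95%, consistent with the ">95%" figure stated in the text.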

The surfaces of the microwell devices were then washed with PBS to remove unadsorbed cells from the chip surfaces, and 25 μL of poly(T) magnetic beads (mRNA Direct Kit, 2.8 μm diameter, Invitrogen Corp.) were resuspended in 50 μL PBS and applied to each microwell device surface, for an average of 55 poly(T) beads per well. After magnetic beads were allowed to settle into wells by gravity, a BSA-blocked dialysis membrane (12,000-14,000 MWCO regenerated cellulose, 25 mm flat width, Fisher Scientific) that had been rinsed in PBS was laid over each chip surface. PBS was removed from the chip and membrane surfaces using a 200 μL pipette. Then, the tapered end of a 1000 μL pipette tip was cut to form a flat cylinder that was dragged across the membranes, pressing the membranes to the PDMS chips and eliminating excess PBS from between the PDMS microwell devices and dialysis membrane, which sealed the microwells and trapped cells and beads inside (FIG. 1).

Cell lysis and mRNA binding to the poly(T) magnetic beads trapped within microwells were accomplished by dialysis. 500 μL of cell lysis solution (500 mM LiCl in 100 mM tris buffer (pH 7.5) with 0.1% sodium deoxycholate and 10 mM ribonucleoside vanadyl complex) was applied to the dialysis membranes, and lysis occurred at room temperature as reagents dialyzed into microwells. Cell lysis was fully complete in <5 minutes as determined by time-lapse microscopy (FIG. 2).

PDMS microwell chips were maintained for 20 minutes at room temperature inside a Petri dish, then placed in a cold room at 4° C. for 10 additional minutes. A Dynal MPC-S magnet was placed underneath the PDMS microwell device to hold magnetic beads inside microwells as the dialysis membrane was removed with forceps and discarded. The magnet was then placed underneath another Petri dish with 4 subdivisions, one of which contained 2 mL of cold mRNA Direct Lysis/Binding Buffer (100 mM tris pH 7.5, 500 mM LiCl, 10 mM EDTA, 1% LiDS, 5 mM DTT). The four PDMS microwell devices were sequentially inverted and resuspended in the 2 mL of solution to allow the magnet to draw beads out of microwells and into the mRNA Direct Lysis/Binding Buffer solution. Magnetic beads were then resuspended in the 2 mL mRNA Direct Lysis/Binding Buffer and the solution was divided into two Eppendorf tubes and placed on the Dynal MPC-S magnetic rack. Beads were washed once without resuspension using 1 mL per tube of Wash Buffer 1 (100 mM tris pH 7.5, 500 mM LiCl, 1 mM EDTA, 4° C.). Beads were then immediately washed again in Wash Buffer 1 with resuspension. Beads were then immediately resuspended in Wash Buffer 2 (20 mM tris pH 7.5, 50 mM KCl, 3 mM MgCl₂) and replaced on the magnetic rack. Finally, beads were suspended in 2.85 mL cold RT-PCR mixture (Quanta OneStep Fast, VWR) containing 0.05 wt % BSA (Invitrogen Ultrapure BSA, 50 mg/mL) and primer concentrations listed in Table 1. Amplification was accomplished with two common primers (CHrev-AHX89 and CLrev-BRH06) at high concentration which anneal to the reverse complement of the 5′ end of CLrev and CHrev specific primers. V-region primers also contain linker sequences at the 5′ end to effect VH-VL linkage. 25 μL of the cold RT-PCR mixture was previously reserved for cycling without beads or emulsification as a non-template control.
The cold RT-PCR mixture containing the poly(T) magnetic beads was added dropwise to a stirring Ika dispersing tube (DT-20, VWR) containing 9 mL chilled oil phase (molecular biology grade mineral oil with 4.5% Span-80, 0.4% Tween 80, 0.05% Triton X-100, v/v % oil phase reagents from Sigma Aldrich Corp.), and the mixture was agitated for 5 minutes at low speed. The resulting emulsion was added to 96-well PCR plates, with 100 μL emulsion per well, and placed in a thermocycler. The RT step was performed under the following conditions: 30 minutes at 55° C., followed by 2 min at 95° C. PCR amplification was then performed under the following conditions: three cycles of 94° C. for 30 s denature, 57° C. for 1 min anneal, and 72° C. for 3 min extend; then twenty-seven cycles of 94° C. for 30 s denature, 59° C. for 30 s anneal, and 72° C. for 3 min extend; then a final extension step for 7 min at 72° C. FIG. 3 shows a diagram of the final linked products.

TABLE 1 Primers for MOPC-21/MOPC-315 emulsion linkage RT-PCR.

Conc.   Primer ID
400     CLrev-BRH06
400     CHrev-AHX89
40      MOPC21-CHrev-AHX89
40      MOPC21-CLrev-BRH06
40      MOPC315-CLrev-BRH06
40      MOPC315-CHrev-AHX89
40      MOPC21-VH-OE2
40      MOPC21-VL-OE2
40      MOPC315-VH-OE
40      MOPC315-VL-OE

Following thermal cycling, the emulsion was collected and divided into three Eppendorf tubes and centrifuged at room temperature for 10 minutes at 16,000 g. The mineral oil upper phase was discarded, and 1.5 mL diethyl ether was added to extract the remaining oil phase and break the emulsion. The upper ether layer was removed and two more ether extractions were performed. Then the ether layer was discarded, and residual ether solvent was removed in a SpeedVac for 25 minutes at room temperature. The remaining aqueous phase was diluted 5:1 in DNA binding buffer, then split in three parts and passed through three silica spin columns (DNA Clean & Concentrator, Zymo Research Corp.) to capture the RT-PCR cDNA product. After washing each column with 300 μL wash buffer (Zymo Research Corp), cDNA was eluted with 20 μL in each column, and a nested PCR reaction was performed (ThermoPol PCR buffer with Taq Polymerase, New England Biolabs) in a total volume of 200 μL using 4 μL eluted cDNA as template. After a 2 min denaturing step at 94° C., cycling was performed at 94° C. for 30 s denature, 62° C. for 30 s anneal, 72° C. for 20 s extend, for 30 cycles. 400 nM of each nested primer (Table 2) was used to amplify linked heavy and light chains, which generated an approximately 800 bp linked product.

TABLE 2 Primers for MOPC-21/MOPC-315 nested PCR.

Conc. (nM)   Primer ID
400          MOPC21-CHrev-seq
400          MOPC21-CLrev-seq
400          MOPC315-CHrev-seq
400          MOPC315-CLrev-seq

Nested PCR product was electrophoresed on a 1% agarose gel, and the 800 bp band was excised and dissolved in agarose-dissolving buffer for 10 minutes at 50° C., then captured onto and eluted from a silica spin column according to manufacturer protocols (Zymo Research Corp.) to obtain purified nested PCR product. Purified cDNA was submitted for base pair paired-end reads with the Illumina HiSeq sequencing platform. Other NextGen sequencing technology (e.g., Roche 454, Pacific Biosciences, etc.) capable of providing reads suitable for identifying the linked transcript can also be used for this purpose (FIG. 3). HiSeq data output was mapped to known MOPC-21 and MOPC-315 sequences using the SHort Read Mapping Package software (SHRiMP) and filtered for high-quality reads with >90% identity to known transcript sequences. In this manner, approximately 18,000 linked heavy and light chain sequences were obtained (Table 3).

TABLE 3 Raw read counts for sequenced VH-VL pairs.

                  Light MOPC-21   Light MOPC-315
Heavy MOPC-21     9,689           426
Heavy MOPC-315    1,042           6,591

Correct transcript pairings were further determined from the degree of pairing skewness of the raw DNA sequencing data. For any given two transcripts, e.g., immunoglobulin heavy chain H_(i) and light chain L_(j), with overall heavy or light chain mapped frequencies f_(Hi) and f_(Lj), a measure of pairing skewness, s, is computed:

$s = {\frac{{Observed}\mspace{14mu} {reads}}{{Expected}\mspace{14mu} {reads}\mspace{14mu} ( {{random}\mspace{14mu} {pairing}} )} = \frac{( {\# H_{i}L_{j}\mspace{14mu} {pairs}} )}{f_{Hi} \times f_{Lj} \times ( {\# \mspace{14mu} {total}\mspace{14mu} {pairs}} )}}$

The calculated value s compares VL frequency paired with a particular VH to the VL frequency in the entire sequence set. A value of s>1 indicates that a heavy-light pair is observed at a frequency above that corresponding to random pairing. Natural pairings are deduced from entries with a maximum value of s (Table 4). Pairing skewness, s, for sequenced heavy and light pairs, calculated from approximately 18,000 sequenced VH-VL linked pairs, is shown in Table 4. Native heavy-light pairings are predicted by the maximal value of s for each heavy chain and are highlighted in green. This table demonstrates the capacity of our method to resolve native heavy and light chain pairings from a heterogeneous mixture of cells.

TABLE 4 Calculated Pairing Skewness.

                  Light MOPC-21   Light MOPC-315
Heavy MOPC-21     1.58            0.11
Heavy MOPC-315    0.23            2.18
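As a check on the skewness definition, the following Python sketch recomputes s from the raw read counts of Table 3 and picks the light chain with the maximal s for each heavy chain. It is a minimal illustration, not the analysis software used for the examples; the function and variable names are assumptions introduced here.

```python
# Raw read counts from Table 3, keyed as counts[heavy][light].
TABLE_3_COUNTS = {
    "MOPC-21": {"MOPC-21": 9689, "MOPC-315": 426},
    "MOPC-315": {"MOPC-21": 1042, "MOPC-315": 6591},
}


def pairing_skewness(counts):
    """Return s[heavy][light] = observed pairs / (f_H * f_L * total pairs)."""
    total = sum(n for row in counts.values() for n in row.values())

    # Overall mapped frequency of each heavy chain (row sums / total).
    heavy_freq = {h: sum(row.values()) / total for h, row in counts.items()}

    # Overall mapped frequency of each light chain (column sums / total).
    light_totals = {}
    for row in counts.values():
        for light, n in row.items():
            light_totals[light] = light_totals.get(light, 0) + n
    light_freq = {light: n / total for light, n in light_totals.items()}

    return {
        h: {l: n / (heavy_freq[h] * light_freq[l] * total) for l, n in row.items()}
        for h, row in counts.items()
    }


def predict_native_pairs(skewness):
    """For each heavy chain, choose the light chain with the maximal skewness."""
    return {h: max(row, key=row.get) for h, row in skewness.items()}


if __name__ == "__main__":
    s = pairing_skewness(TABLE_3_COUNTS)
    for heavy, row in s.items():
        for light, value in row.items():
            print(f"H {heavy:9s} L {light:9s} s = {value:.2f}")
    print(predict_native_pairs(s))
```

Rounded to two decimals, the computed values match Table 4 (1.58, 0.11, 0.23 and 2.18), and the maximal s for each heavy chain selects the light chain from the same cell line.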

Example 3—High-Throughput Transcripts Pairing Analysis Using Defined Mixtures of 5 Cell Lines

Five immortalized B cell lines were mixed at different ratios and used to examine pairing efficiency of the linked products generated by OE-PCR. The five B cell lines used in this experiment were: MOPC-21, MOPC-315, IM-9, ARH-77, and DB (see Table 5). DB expresses extremely low levels of VH and VL transcript and was used as a negative control.

All cell lines were obtained from ATCC and cultured in RPMI-1640 supplemented with 10% FBS and 1% penicillin/streptomycin (see Example 2). Following a 30-minute RNAse treatment and subsequent wash, cells were seeded into microwells at a density of 17,500 total cells per chip along with poly(T) magnetic beads according to Example 2. Wells were sealed with a dialysis membrane, cells were lysed, and mRNA was allowed to anneal to the beads (Example 2). Beads were then recovered, resuspended in OE RT-PCR mix, and placed in an emulsion (Example 2). OE RT-PCR primer concentrations used are given in Table 6, and thermal cycling conditions are presented in Table 7.

TABLE 5 An overview of the 5 cell lines.

% in Mix   Cell Line   ATCC ID    Organism       Ig Class   Relative Ig Expression
65         IM-9        CCL-159    Homo sapiens   IgG/IgK    Low
35         MOPC-21     63035      Mus musculus   IgG/IgK    Medium
6          ARH-77      CRL-1621   Homo sapiens   IgG/IgK    High
3          MOPC-315    TIB-23     Mus musculus   IgA/IgL    Medium
1          DB          CRL-2289   Homo sapiens   IgG/IgL    Very Low

TABLE 6 OE RT-PCR primers for the mix of cell lines.

Conc.   Primer ID
400     CLrev-BRH06
400     CHrev-AHX89
40      MOPC21-CHrev-AHX89
40      MOPC21-CLrev-BRH06
40      MOPC315-CLrev-BRH06
40      MOPC315-CHrev-AHX89
40      MOPC21-VH-OE2
40      MOPC21-VL-OE2
40      MOPC315-VH-OE
40      MOPC315-VL-OE
40      hIgG-rev-OE-AHX89
40      hIgKC-rev-OE-BRH06
40      hIgLC-rev-OE-BRH06
40      hVH1-fwd-OE
40      hVH157-fwd-OE
40      hVH2-fwd-OE
40      hVH3-fwd-OE
40      hVH4-fwd-OE
40      hVH4-DP63-fwd-OE
40      hVH6-fwd-OE
40      hVH3N-fwd-OE
40      hVK1-fwd-OE
40      hVK2-fwd-OE
40      hVK3-fwd-OE
40      hVK5-fwd-OE
40      hVL1-fwd-OE
40      hVL1459-fwd-OE
40      hVL15910-fwd-OE
40      hVL2-fwd-OE
40      hVL3-fwd-OE
40      hVL-DPL16-fwd-OE
40      hVL3-38-fwd-OE
40      hVL6-fwd-OE
40      hVL78-fwd-OE

TABLE 7 OE RT-PCR thermal cycling conditions.

# Cycles   Temp (° C.)   Time (min)
1          55            30
           94            2
4          94            0.5
           50            0.5
           72            3
4          94            0.5
           55            0.5
           72            3
22         94            0.5
           60            0.5
           72            3
1          72            7

Emulsion OE RT-PCR product was recovered by diethyl ether extraction followed by capture on and elution from a silica spin column (Example 2) for use as template in a nested PCR under the following conditions: 94° C. for 2 min initial denature, 94° C. for 30 s denature, 62° C. for 30 s anneal, 72° C. for 20 s extend, 40 total cycles. Nested primer sequences and concentrations are reported in Tables 2 and 8.

TABLE 8 Nested PCR primers to generate approximately 800 bp linked products.

Conc. (nM)   Primer ID
400          hIgG-all-rev-OEnested
400          hIgKC-rev-OEnested
400          hIgLC-rev-OEnested

Nested PCR product was electrophoresed on a 1% agarose gel, and a region from 650 to 1000 bp was excised and purified with a silica spin column (Example 2). Recovered cDNA was submitted for Illumina HiSeq 100 bp paired-end sequencing. HiSeq data were mapped to a reference file containing heavy and light chain sequences for all five clones, and data were filtered to obtain paired-end reads with >90% match to reference sequences, as in Example 2. Natural pairings were identified by interrogating skewness of pairing data. For any given immunoglobulin heavy chain Hi and light chain Lj, with overall heavy or light chain mapped frequencies fHi and fLj, a measure of pairing skewness, s, was computed:

$s = {\frac{{Observed}\mspace{14mu} {reads}}{{Expected}\mspace{14mu} {reads}\mspace{14mu} ( {{random}\mspace{14mu} {pairing}} )} = \frac{( {\# {HiLj}\mspace{14mu} {pairs}} )}{{fHi} \times {fLj} \times ( {\# \mspace{14mu} {total}\mspace{14mu} {pairs}} )}}$

The calculated value s compares VL frequency paired with a particular VH to the VL frequency in the entire sequence set. A value of s>1 indicates that a heavy-light pair is observed at a frequency above that corresponding to random pairing. Natural pairings are deduced from entries with a maximum value of s for each heavy chain. Table 9 shows the natural pairings identified and pairing skewness, s, for sequenced heavy and light pairs, calculated from approximately 66,000 sequenced VH-VL linked pairs. Native heavy-light pairings were predicted by the maximal value of s for each heavy chain and are highlighted in gray. Table 9 demonstrates the ability of our method to resolve native heavy and light chain pairings from a heterogeneous mixture of cells with high throughput.

Example 4—Method for Linking Two Transcripts from Single B Cells Trapped within High Density Microwell Plates

A population of B cells is allowed to settle by gravity into PDMS microwell plates, constructed as described in Example 1. In this example, each PDMS slide contains 1.7×10⁵ wells so that four slides processed concurrently accommodate 68,000 lymphocytes at a ≧1:10 cell/well occupancy, which gives at least a 95% probability of there being only one cell per well based on Poisson statistics. Poly(dT) magnetic beads with a diameter of 2.8 μm are deposited into the microwells at an average of 55 beads/well and the slides are covered with a dialysis membrane. Subsequently, the membrane-covered slides are incubated with an optimized cell lysis solution containing 1% lithium dodecyl sulfate that results in complete cell lysis within <1 min. mRNA anneals to the poly(dT) magnetic beads, which are collected, washed and resuspended in solution with reagents, primers, reverse transcriptase enzyme, and polymerase enzyme for overlap extension (OE) RT-PCR. In this manner beads become isolated within the droplets that comprise the water in oil emulsion. The emulsion is subjected to thermal cycling to physically link the two transcripts (e.g., immunoglobulin heavy and light chain cDNA), and the linked products are recovered from the emulsion following cycling. A nested PCR amplification is performed, and then the resulting DNA is sequenced using Illumina or any other NextGen sequencing technology that can yield reads of appropriate length to unequivocally interpret the transcript pairing information. An overview of the process is presented in FIG. 6.

The method outlined above was employed to link the immunoglobulin variable heavy (VH) and variable light (VL) chains in mixtures of human primary cells.

A healthy 30-year-old male was vaccinated with the 2010-2011 trivalent FluVirin influenza vaccine (Novartis) and blood was drawn at day 14 after vaccination after informed consent had been obtained. PBMCs were isolated and resuspended in DMSO/10% FCS for cryopreservation. Frozen PBMCs were thawed and cell suspensions were stained in PBS/0.2% BSA with anti-human CD19 (HIB19, BioLegend, San Diego, Calif.), CD27 (O323, BioLegend), CD38 (HIT2, BioLegend) and CD3 (7D6, Invitrogen, Grand Island, N.Y.). CD19⁺CD3⁻CD27⁺CD38^(int) memory B cells were sorted using a FACSAria II sorter system (BD Biosciences, San Diego, Calif.). Cells were either cryopreserved in DMSO/10% FCS for subsequent high-throughput VH:VL pairing or single-cell sorted into 96-well plates containing RNAse Inhibitor Cocktail (Promega, Madison, Wis.) and 10 mM Tris-HCl pH 8.0 for single-cell RT-PCR analysis. cDNA was synthesized from single-sorted cells using the Maxima First Strand cDNA Synthesis Kit (Fermentas, Waltham, Mass.) followed by amplification of the immunoglobulin variable genes using primer sets and PCR conditions previously described (Smith et al., 2009). Variable genes were determined with in-house analysis software using the IMGT search engine (Brochet et al., 2008).

Memory B cells frozen for high-throughput VH:VL pairing were thawed and recovered by centrifugation at 250 g for 10 min. Cells were resuspended in 200 μl RPMI-1640 supplemented with 1×GlutaMAX, 1×non-essential amino acids, 1×sodium pyruvate and 1×penicillin/streptomycin (Life Technologies) and incubated at 37° C. for 13 h in a 96-well plate. Recovered cells were centrifuged again at 250 g for 10 min and resuspended in 400 μl PBS, and 6 μl were withdrawn for cell counting with a hemocytometer. Approximately 8,800 cells were recovered from frozen stock. Memory B cells were then spiked with ˜880 IM-9 cells (ATCC number CCL-159) as an internal control. Cells were resuspended over two PDMS microwell slides (340,000 wells) and allowed to settle into wells by gravity over the course of 5 min with gentle agitation. The cell seeding process has been calculated to be 90% efficient by measuring cell concentration in seeding buffers both pre- and post-cell seeding; thus 8,000 primary cells were analyzed in this experiment. The fraction of cells isolated in the single and multiple cell per well states was calculated using Poisson statistics:

${P( {k,\mu} )} = \frac{\mu^{k}e^{- \mu}}{k!}$

where k equals the number of cells in a single microwell and μ is the average number of cells per well, so that the 1:39 cell:well ratio used in this experiment corresponds to 98.7% of cells deposited at an occupancy of one cell/well. 25 μl of poly(dT) magnetic beads (Invitrogen mRNA Direct Kit) were resuspended in 50 μl PBS and distributed over each PDMS slide surface (mean of 55 poly(dT) beads per well). Magnetic beads were allowed to settle into wells by gravity for ˜5 min, then a BSA-blocked dialysis membrane (12,000-14,000 MWCO regenerated cellulose, 25-mm flat width, Fisher Scientific) that had been rinsed in PBS was laid over each slide surface, sealing the microwells and trapping cells and beads inside (FIG. 1). Excess PBS was removed from the slide and membrane surfaces using a 200 μL pipette. 500 μL of cell lysis solution (500 mM LiCl in 100 mM TRIS buffer (pH 7.5) with 1% lithium dodecyl sulfate, 10 mM EDTA and 5 mM DTT) was applied to the dialysis membranes for 20 min at room temperature. Time-lapse microscopy revealed that all cells are fully lysed within 1 min (FIG. 2). Subsequently, the slides were incubated at 4° C. for 10 min at which point a Dynal MPC-S magnet was placed underneath the PDMS microwell device to hold magnetic beads inside the microwells as the dialysis membrane was removed with forceps and discarded. The PDMS slides were quickly inverted in a Petri dish containing 2 mL of cold lysis solution and the magnet was applied underneath the Petri dish to force the beads out of the microwells. Subsequently, 1 ml aliquots of the lysis solution containing resuspended beads were placed into Eppendorf tubes and beads were pelleted on a Dynal MPC-S magnetic rack and washed once without resuspension using 1 mL per tube of wash buffer 1 (100 mM Tris, pH 7.5, 500 mM LiCl, 1 mM EDTA, 4° C.). Beads were resuspended in wash buffer 1, pelleted and resuspended in wash buffer 2 (20 mM Tris, pH 7.5, 50 mM KCl, 3 mM MgCl₂) and pelleted again.
Finally, beads were suspended in 2.85 mL cold RT-PCR mixture (Quanta OneStep Fast, VWR) containing 0.05 wt % BSA (Invitrogen Ultrapure BSA, 50 mg/mL) and primer sets for VH and VL linkage amplification (FIG. 4 and Tables 6 and 10). The suspension containing the poly(dT) magnetic beads was added dropwise to a stirring IKA dispersing tube (DT-20, VWR) containing 9 mL chilled oil phase (molecular biology grade mineral oil with 4.5% Span-80, 0.4% Tween 80, 0.05% Triton X-100, v/v %, Sigma-Aldrich, St. Louis, Mo.), and the mixture was agitated for 5 min at low speed. The resulting emulsion was added to 96-well PCR plates with 100 μL emulsion per well and placed in a thermocycler. The RT step was performed under the following conditions: 30 min at 55° C., followed by 2 min at 94° C. PCR amplification was performed under the following conditions: four cycles of 94° C. for 30 s denature, 50° C. for 30 s anneal, 72° C. for 2 min extend; four cycles of 94° C. for 30 s denature, 55° C. for 30 s anneal, 72° C. for 2 min extend; 22 cycles of 94° C. for 30 s denature, 60° C. for 30 s anneal, 72° C. for 2 min extend; then a final extension step for 7 min at 72° C. After thermal cycling the emulsion was visually inspected to ensure the absence of a bulk water phase, which is a key indicator of emulsion stability. Following visual verification, the emulsion was collected and centrifuged at room temperature for 10 min at 16,000 g, the mineral oil upper phase was discarded, and 1.5 mL diethyl ether was added to extract the remaining oil phase and break the emulsion. The upper ether layer was discarded, two more ether extractions were performed and residual ether was removed in a SpeedVac for 25 min at room temperature. The aqueous phase was diluted 5:1 in DNA binding buffer and passed through a silica spin column (DNA Clean & Concentrator, Zymo Research, Irvine, Calif.) to capture the cDNA product.
The column was washed twice with 300 μL wash buffer (Zymo Research Corp) and cDNA was eluted into 40 μL nuclease-free water. Finally, a nested PCR amplification was performed (ThermoPol PCR buffer with Taq Polymerase, New England Biolabs, Ipswich, Mass.) in a total volume of 200 μL using 4 μL of eluted cDNA as template with 400 nM primers (Tables 8 and 11) under the following conditions: 2 min initial denaturation at 94° C.; 39 cycles of denaturation at 94° C. for 30 s, annealing at 62° C. for 30 s and extension at 72° C. for 20 s; and final extension at 72° C. for 7 min. The approximately 850 bp linked product (FIG. 3) was extracted by agarose gel electrophoresis and sequenced using the 2×250 paired end MiSeq NextGen platform (Illumina, San Diego, Calif.).

TABLE 10 Primer sets for human VH and VL linkage RT-PCR amplification.

Conc. (nM)    Primer ID
40            hIgA-rev-OE-AHX89
40            hIgM-rev-OE-AHX89

TABLE 11 Primer sets for human VH and VL nested PCR amplification.

Conc. (nM)    Primer ID
400           hIgA-all-rev-OEnested
400           hIgM-rev-OEnested

For bioinformatic analysis, raw 2×250 MiSeq data were filtered for a minimum Phred quality score of 20 over 50% of nucleotides to ensure high read quality in the CDR3-containing region (approximately HC nt 65-115 or LC nt 55-100). Sequence data were submitted to the International ImMunoGeneTics Information System (IMGT) for mapping to germline V(D)J genes (Brochet et al., 2008). Sequence data were filtered for in-frame V(D)J junctions, and productive VH and Vκ,λ sequences were paired by Illumina read ID. CDR-H3 nucleotide sequences were extracted and clustered to 96% nt identity with terminal gaps ignored, to generate a list of unique CDR-H3s in the data set. A 96% nt identity cutoff was found to be the optimal cutoff to cluster sequencing error in spiked control clones; the number of unique CDR-H3 sequences and hence the number of unique V genes reported refer to the number of clusters recovered from the sample (Table 12). The top read-count CDR-L3 for each CDR-H3 cluster was assigned as a cognate pair and a list of recovered VH:VL pairs was generated. The observed accuracy ratio of 942:1 demonstrated the preservation of correct heavy and light chain pairings in the IM-9 spiked control cell line (Table 12).
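By way of non-limiting illustration, the read quality filter and the assignment of the top read-count CDR-L3 to each CDR-H3 described above may be sketched in Python as follows. The function names and the input layout (a list of Phred scores per read, and a list of CDR-H3/CDR-L3 tuples per read pair) are assumptions made for this sketch; clustering to 96% nt identity is not shown and is assumed to have been applied beforehand.

```python
from collections import Counter, defaultdict


def passes_quality(phred_scores, min_q=20, min_fraction=0.5):
    """Return True if at least min_fraction of bases have Phred >= min_q."""
    if not phred_scores:
        return False
    good = sum(1 for q in phred_scores if q >= min_q)
    return good / len(phred_scores) >= min_fraction


def assign_cognate_light_chains(paired_reads):
    """Map each CDR-H3 to the CDR-L3 observed with it in the most reads.

    paired_reads: iterable of (cdr_h3, cdr_l3) tuples, one per read pair.
    CDR-H3 values are assumed to be cluster representatives already.
    """
    counts = defaultdict(Counter)
    for h3, l3 in paired_reads:
        counts[h3][l3] += 1
    return {h3: c.most_common(1)[0][0] for h3, c in counts.items()}
```

In this sketch a tie between two CDR-L3 sequences is resolved in favor of the one seen first; a production pipeline might instead flag such cases for review.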

TABLE 12 Key experimental statistics for Example 4.

Immunization                    Influenza (2010-11 Fluvirin)
Cell Type                       Day 14 memory B cells
Fresh Cells vs. Freeze/Thaw     Freeze/Thaw
Cell:Well Ratio                 1:39
% cells as single cells         98.7%
Unique CDR-H3 Recovered         240
Control Cell Spike              IM-9
Accuracy Ratio¹                 942:1

¹For known spiked cells, (reads correct VL):(reads top incorrect VL)
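The accuracy ratio defined in the footnote to Table 12 can be computed from the light chain reads paired with a spiked control heavy chain. The following Python sketch is illustrative only; the function name and input format are hypothetical, and the read labels in any usage are placeholders rather than experimental data.

```python
from collections import Counter


def accuracy_ratio(light_chain_reads, correct_vl):
    """(reads with the correct VL) / (reads of the most frequent incorrect VL).

    light_chain_reads: iterable of VL identifiers paired with the control VH.
    Returns float('inf') when no incorrect VL was observed.
    """
    counts = Counter(light_chain_reads)
    correct = counts.pop(correct_vl, 0)
    if not counts:
        return float("inf")
    top_incorrect = counts.most_common(1)[0][1]
    return correct / top_incorrect
```

Note that the denominator is the single most abundant incorrect VL, not the sum of all incorrect reads, which follows the footnote wording.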

The VH:VL pairings identified using this high-throughput approach were compared to those identified using the established single-cell sorting method (Smith et al., 2009; Wrammert et al., 2008); this analysis was conducted in a double-blinded manner. Peripheral CD19⁺CD3⁻CD27⁺CD38^(int) memory B cells were isolated from a healthy volunteer 14 d after vaccination with the 2010-2011 trivalent FluVirin influenza vaccine (Smith et al., 2009). For the scRT-PCR analysis, 168 single B cells were sorted into four 96-well plates, and 168 RT and 504 nested PCR reactions were carried out individually to separately amplify the VH and VL (κ and λ) genes. DNA products were resolved by gel electrophoresis and sequenced to yield a total of 51 VH:VL pairs, of which 50 were unique. By the high-throughput approach, a total of 240 unique CDR-H3:CDR-L3 pairs were recovered (Table 12). Four CDR-H3 sequences detected in the high-throughput pairing set were also observed in the single-cell RT-PCR analysis. A blinded analysis revealed that CDR-H3:CDR-L3 pairs isolated by the two approaches were in complete agreement (DeKosky et al., 2013). The agreement between established single-cell RT-PCR sequencing methods and the high-throughput sequencing methods demonstrated high accuracy in VH:VL sequences recovered according to the methods described in the present disclosure.

Example 5—Isolation of High Affinity Antibodies Following High-Throughput VH:VL Pairing

This example describes the isolation of high affinity anti-tetanus antibodies from human peripheral B cells following booster immunization. One female donor was booster immunized against TT/diphtheria toxoid (TD, 20 I.E. TT and 2 I.E. diphtheria toxoid, Sanofi Pasteur Merck Sharp & Dohme GmbH, Leimen, Germany) after informed consent by the Charité Universitätsmedizin Berlin had been obtained (samples were anonymously coded and the study was approved by the hospital's ethical approval board, number EA1/178/11, and the University of Texas at Austin Institutional Review Board, IRB# 2011-11-0095). At 7 d post TT immunization, EDTA blood was withdrawn and PBMC were isolated by density gradient separation as described (Mei et al., 2009). PBMCs were stained in PBS/BSA at 4° C. for 15 min with anti-human CD3/CD14-PacB (clones UCHT1 and M5E2, respectively, Becton Dickinson, BD), CD19-PECy7 (clone SJ25C1, BD), CD27-Cy5 (clone 2E4, kind gift from René van Lier, Academic Medical Centre, University of Amsterdam, The Netherlands, labeled at the Deutsches Rheumaforschungszentrum (DRFZ), Berlin), CD20-PacO (clone HI47, Invitrogen), IgD-PerCpCy5.5 (clone L27, BD), CD38-PE (clone HIT2, BD) and TT-Digoxigenin (labeled at the DRFZ). Cells were washed and a second staining was performed with anti-Digoxigenin-FITC (Roche, labeled at the DRFZ) and DAPI was added before sorting. CD19⁺CD3⁻CD14⁻CD38⁺⁺CD27⁺⁺CD20⁻TT⁺ plasmablasts were sorted using a FACSAria II sorter system (BD Biosciences). A portion of the sorted cells was washed and cryopreserved in DMSO/10% FCS for high-throughput VH:VL pairing.

One vial containing approximately 2,000 frozen TT⁺ plasmablasts was thawed and recovered by centrifugation at 250×g for 10 min; approximately 20-30% of the cells are anticipated to be viable (Kyu et al., 2009). Cells were resuspended in 300 μL RPMI-1640 supplemented with 10% FBS, 1× GlutaMAX, 1× non-essential amino acids, 1× sodium pyruvate and 1× penicillin/streptomycin (all from Life Technologies) and incubated at 37° C. for 13 h in a 96-well plate. Recovered cells were centrifuged again at 250×g for 10 min and resuspended in 400 μL PBS, and 6 μL were withdrawn for cell counting with a hemocytometer. Cells were spiked with approximately 30 ARH-77 cells as an internal control (ATCC number CRL-1621) and VH:VL transcripts were linked as described in Example 4, omitting IgM primers and using a 38-cycle nested PCR; the resulting product was submitted for 2×250 MiSeq sequencing. VH and VL chains were also amplified individually to obtain full VH and VL sequences for antibody expression. Nested PCR product was diluted 1:9 and 0.5 μL was used as template in a PCR reaction with the following conditions: 400 nM primers (Tables 8, 11 and 13); 2 min initial denaturation at 94° C.; 12 cycles of denaturation at 94° C. for 30 s, annealing at 62° C. for 30 s and extension at 72° C. for 15 s; then a final extension at 72° C. for 7 min. The resulting ~450 bp VH or ~400 bp VL products were purified by agarose gel electrophoresis and submitted for 2×250 MiSeq sequencing. Sequence data were processed as described above; additionally, ten VH and VL pairs were selected from TT⁺ plasmablast pairings for antibody expression and testing. For complete antibody sequencing of these ten genes, 2×250 bp reads containing the 5′ V gene FR1-CDR2 and 3′ CDR2-FR4 were paired by Illumina read ID and consensus sequences were constructed from reads containing the exact CDR3 of interest. Antibody genes were then cloned into the human IgG expression vectors pMAZ-VH and pMAZ-VL, respectively (Mazor et al., 2007). 40 μg each of circularized ligation product were co-transfected into HEK293F cells (Invitrogen, N.Y., USA). Medium was harvested 6 d after transfection by centrifugation and IgG was purified by a protein-A agarose (Pierce, Ill., USA) chromatography column.
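One possible way to construct a consensus sequence from reads containing the exact CDR3 of interest is a per-position majority vote, sketched below in Python. This is an illustration under stated assumptions, not necessarily the procedure used here: the reads are assumed to be already aligned and trimmed to equal length, and the function name is hypothetical.

```python
from collections import Counter


def consensus_sequence(reads, cdr3):
    """Majority-vote consensus over equal-length reads that contain cdr3 exactly.

    reads: list of nucleotide strings, assumed pre-aligned to the same frame.
    cdr3:  exact CDR3 nucleotide string used to select reads.
    """
    selected = [r for r in reads if cdr3 in r]
    if not selected:
        raise ValueError("no reads contain the CDR3 of interest")
    length = len(selected[0])
    if any(len(r) != length for r in selected):
        raise ValueError("reads must be pre-aligned to equal length")
    # zip(*selected) iterates column by column across the selected reads
    return "".join(Counter(col).most_common(1)[0][0] for col in zip(*selected))
```

Selecting reads by exact CDR3 match before voting keeps reads from closely related clones out of the consensus.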

TABLE 13 Linkers for VH and VL separate amplification primers.

Conc. (nM)    Primer ID
400           Linker-VHfwd
400           Linker-VLfwd

Antigen affinities were determined by competitive ELISA (Friguet et al., 1985) using different concentrations of IgG in a serial dilution of antigen, ranging from 100 nM to 0.05 nM in the presence of 1% milk in PBS. Plates were coated overnight at 4° C. with 10 μg/mL of TT in 50 mM carbonate buffer, pH 9.6, washed three times in PBST (PBS with 0.1% Tween 20) and blocked with 2% milk in PBS for 2 h at room temperature. Pre-equilibrated samples of IgG with TT antigen were added to the blocked ELISA plate and incubated for 1 h at room temperature; plates were then washed 3× with PBST and incubated with 50 μL of anti-human kappa light chain-HRP secondary antibody (1:5,000, 2% milk in PBS) for ~2 min at 25° C. Plates were washed 3× with PBST, then 50 μL Ultra TMB substrate (Thermo Scientific, Rockford, Ill.) was added to each well and incubated at 25° C. for 5 min. Reactions were stopped using an equal volume of 1 M H₂SO₄ and absorbance was read at 450 nm (BioTek, Winooski, Vt.). Each competitive ELISA replicate was fit using a four-parameter logistic (4PL) equation, with error represented as the s.d. of 2-3 replicates for each IgG analyzed. All ten antibodies showed specificity for TT and bound TT with high affinity (0.1 nM ≤ KD ≤ 18 nM; Table 14) (DeKosky et al., 2013). The high affinity of the anti-TT antibodies recovered demonstrates the application of the high-throughput VH:VL sequencing methods in the present disclosure for antibody discovery from human cell donors.
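For reference, one common parameterization of a four-parameter logistic curve, together with a replicate summary, is sketched below in Python. The parameter names, the decreasing form of the curve, and the use of the sample standard deviation are assumptions of this sketch; the curve-fitting routine and the conversion of fitted parameters to K_D by the method of Friguet et al. are not shown.

```python
from statistics import mean, stdev


def four_pl(x, bottom, top, midpoint, hill):
    """Four-parameter logistic response at competitor concentration x (x > 0).

    With hill > 0 the response falls from `top` (x -> 0) to `bottom`
    (x -> infinity) and equals (top + bottom) / 2 at x == midpoint.
    """
    return bottom + (top - bottom) / (1.0 + (x / midpoint) ** hill)


def summarize_replicates(values):
    """Mean and sample s.d. of per-replicate estimates (needs >= 2 values)."""
    return mean(values), stdev(values)
```

Each replicate would be fit separately and the per-replicate estimates then passed to `summarize_replicates`, mirroring the "mean ± s.d." presentation of Table 14.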

TABLE 14 Tetanus toxoid-binding affinities of IgG isolated by high-throughput sequencing of VH:VL pairs. Affinities were calculated from competitive ELISA dilution curves.

Antibody ID    Gene Family Assignment¹    Affinity (K_D)
TT1            HV3-HD1-HJ6:KV3-KJ5        1.6 ± 0.1 nM
TT2            HV3-HD3-HJ4:LV3-LJ1        14 ± 3 nM
TT3            HV1-HD2-HJ4:KV3-KJ5        3.6 ± 1.8 nM
TT4            HV2-HD2-HJ4:KV1-KJ1        2.7 ± 0.3 nM
TT5            HV4-HD2-HJ6:KV2-KJ3        18 ± 4 nM
TT6            HV1-HD3-HJ4:KV1-KJ2        0.57 ± 0.03 nM
TT7            HV4-HD3-HJ4:KV1-KJ2        0.46 ± 0.01 nM
TT8            HV3-HD3-HJ4:LV8-LJ3        2.8 ± 0.3 nM
TT9            HV4-HD2-HJ4:KV1-KJ1        0.10 ± 0.01 nM
TT10           HV1-HD3-HJ5:KV3-KJ5        1.6 ± 0.1 nM

¹Each heavy and light chain was distinct.

Example 6—Bioinformatic Identification of VH:VL Sequences via Mutual Pairing Agreement

Examples 4 and 5 disclose the identification of correct VH:VL sequence pairs from high throughput sequencing whereby the highest read-count VL sequence for a given VH sequence revealed the native cognate VH:VL pairs encoded by individual B cells. Alternatively, this example describes a method to identify correct VH:VL pairs in high-throughput VH:VL amplicon data via consensus pairing of both VH and VL sequences.

Raw data pairings were collected and the highest-frequency VL for each VH sequence was tabulated into File 1. The top VH for every VL was tabulated separately into File 2. Many computational techniques can be used to accomplish the tabulation step; for example, “grep -m 1 CDR3 filename” in a Bash/Linux shell can select the top-ranked cognate pair for a CDR-H3 or CDR-L3 sequence (CDR3) from a file (filename) containing raw pairing data that has been pre-sorted to contain sequences ordered by descending read counts. Other solutions for data tabulation include the use of a hash to collect sequences and sequence read counts (e.g., Perl computing language), or the use of a dictionary to collect sequences and read counts (e.g., Python) or other data storage structures (e.g., associative memories or associative arrays). File 1 and File 2 were compared and any VH:VL pairs appearing in both files showed “consensus” in that the pair described by the top-ranked VL for a given VH agreed with the top-ranked VH for a given VL. Many computational techniques can be applied to accomplish file comparisons; one solution for file comparison uses the “join” command in Bash/Linux where lines containing desired fields that match across documents are printed to standard output. The algorithm described in the present example was effective both at identifying correct VH:VL pairs and at reducing minor sequence errors because VH:VL pairs containing sequence errors are often filtered out by mutual agreement criteria. A general decision tree of the algorithm used for pairing is provided as FIG. 9.
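The dictionary-based alternative mentioned above can be sketched in Python as follows. This is a minimal illustration of the mutual pairing agreement criterion, not the exact implementation used; the function names and the input format (a mapping from VH/VL tuples to read counts) are assumptions of the sketch, and the two intermediate dictionaries play the roles of File 1 and File 2.

```python
from collections import Counter, defaultdict


def top_partner(pair_counts, key_index):
    """For each sequence at key_index (0 = VH, 1 = VL), return the partner
    sequence supported by the most reads."""
    by_key = defaultdict(Counter)
    for pair, n_reads in pair_counts.items():
        by_key[pair[key_index]][pair[1 - key_index]] += n_reads
    return {k: c.most_common(1)[0][0] for k, c in by_key.items()}


def consensus_pairs(pair_counts):
    """Return VH:VL pairs where the top VL for the VH and the top VH for
    that VL agree."""
    top_vl_for_vh = top_partner(pair_counts, 0)  # analogue of File 1
    top_vh_for_vl = top_partner(pair_counts, 1)  # analogue of File 2
    return {
        (vh, vl)
        for vh, vl in top_vl_for_vh.items()
        if top_vh_for_vl.get(vl) == vh
    }
```

A low-count VH variant carrying a sequencing error typically shares its VL with the correct, higher-count VH, so it fails the second lookup and is dropped, which is how the mutual agreement criterion reduces minor sequence errors.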

Example 7—VH:VL Pairing of Expanded Memory B Cells

Memory B cells were isolated and expanded in vitro, and two aliquots of the expanded cells were processed for high-throughput pairing. In vitro clonal expansion results in multiple copies of cells containing the same VH:VL pairs, thus increasing the probability of sequencing the same VH:VL pair in separate aliquots derived from the same B cell sample.

PBMC were isolated from donated human blood and stained with CD20-FITC (clone 2H7, BD Biosciences, Franklin Lakes, N.J., USA), CD3-PerCP (HIT3a, BioLegend, San Diego, Calif., USA), CD19-v450 (HIB19, BD), and CD27-APC (M-T271, BD). CD3⁻CD19⁺CD20⁺CD27⁺ memory B cells were incubated four days in the presence of RPMI-1640 supplemented with 10% FBS, 1× GlutaMAX, 1× non-essential amino acids, 1× sodium pyruvate and 1× penicillin/streptomycin (all from Life Technologies) along with 10 μg/mL anti-CD40 antibody (5C3, BioLegend), 1 μg/mL CpG ODN 2006 (Invivogen, San Diego, Calif., USA), 100 units/mL IL-4, 100 units/mL IL-10, and 50 ng/mL IL-21 (PeproTech, Rocky Hill, N.J., USA). A total of 91,000 expanded B cells were seeded over 12 chips and, given an estimated 90% well seeding efficiency, approximately 41,000 expanded B cells were analyzed per group (1:25 cell:well ratio) according to the methods described in Example 4. Bioinformatic analysis was performed as described in Example 6. A total of 1,033 CDR-H3 sequences with ≥1 read were sequenced in both groups, and 972/1,033 displayed matching CDR-L3 pairs to yield a 94.09% matching fraction. Pairing accuracy, A_P, can be estimated from the CDR-L3 matching fraction, f_match, of the two independent groups:

$$f_{\mathrm{match}} = A_{P,\mathrm{Group1}} \times A_{P,\mathrm{Group2}} = A_{P}^{2}$$

$$A_{P} = f_{\mathrm{match}}^{1/2}$$

which yielded an overall accuracy of 97.0%. The theoretical limit of accuracy from the rate of single cells per well by Poisson distribution (98% for the 1:25 cell:well ratio utilized in this experiment) correlated very closely with the experimentally determined accuracy of VH:VL pairings.
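The arithmetic above can be checked with a short Python sketch. The pairing accuracy function follows the formula given in this example. The Poisson expression (the fraction of occupied wells that hold exactly one cell) is an assumption of this sketch about how the theoretical limit is derived; it reproduces the approximately 98% figure for a 1:25 cell:well ratio. Function names are hypothetical.

```python
import math


def pairing_accuracy(n_matching, n_shared):
    """A_P = sqrt(f_match), where f_match = n_matching / n_shared."""
    f_match = n_matching / n_shared
    return math.sqrt(f_match)


def single_cell_fraction(cells_per_well):
    """Fraction of occupied wells holding exactly one cell, assuming the
    number of cells per well is Poisson distributed with this mean."""
    lam = cells_per_well
    return lam * math.exp(-lam) / (1.0 - math.exp(-lam))


accuracy = pairing_accuracy(972, 1033)        # counts reported in this example
poisson_limit = single_cell_fraction(1 / 25)  # 1:25 cell:well ratio
print(f"A_P = {accuracy:.3f}, Poisson limit = {poisson_limit:.3f}")
```

Taking the square root reflects that a CDR-H3 must be paired correctly in both independent groups for the CDR-L3 sequences to match.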

Example 8—The Use of Leader Peptide Primers for VH:VL Pairing

In this example, primers which anneal to the leader peptide region of antibody cDNAs (as opposed to primers specific for the framework 1 of the VH and VL domains, disclosed in Example 4) were used to sequence antibody VH:VL pairs. Memory B cells were isolated from donated human PBMC, and cells were split in two groups: Group 1 consisted of 29,000 cells and was analyzed immediately (using a total of 510,000 wells, 1:16 cell:well ratio), while Group 2 was expanded as described in Example 7 and 28,000 cells were analyzed after in vitro expansion (using a total of 680,000 wells, 1:24 cell:well ratio). Both experiments were conducted as described in Example 7 using the leader peptide overlap extension primers reported in Table 15 and emulsion linkage RT-PCR cycling with the following conditions: 30 min at 55° C., followed by 2 min at 94° C.; four cycles of 94° C. for 30 s denature, 54° C. for 30 s anneal, 72° C. for 2 min extend; 29 cycles of 94° C. for 30 s denature, 60° C. for 30 s anneal, 72° C. for 2 min extend; then a final extension step for 7 min at 72° C. An additional barcoded region was also included in the VL linkage primers (16N region) which was used to identify multiple sequence reads of individual linkage events (Table 15). Nested PCR was performed as in Example 5, with 25 PCR cycles for each group.
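As a non-limiting sketch of how a 16N barcode may be used to identify multiple reads of the same linkage event, reads can be grouped by barcode and paired CDR3 sequences in Python. The input format (barcode, CDR-H3, CDR-L3 tuples already extracted from each read) and the rule of discarding barcodes that are not exactly 16 called bases are assumptions of this sketch.

```python
from collections import Counter

BARCODE_LENGTH = 16  # length of the 16N region in the VL linkage primers


def count_linkage_events(reads):
    """Count reads per distinct linkage event.

    reads: iterable of (barcode, cdr_h3, cdr_l3) tuples.
    Returns a Counter keyed by (barcode, cdr_h3, cdr_l3); each key is treated
    as one linkage event and its value is the number of supporting reads.
    """
    events = Counter()
    for barcode, h3, l3 in reads:
        # Skip truncated barcodes and barcodes with uncalled bases
        if len(barcode) == BARCODE_LENGTH and "N" not in barcode:
            events[(barcode, h3, l3)] += 1
    return events
```

The number of keys in the returned counter estimates how many independent linkage events support a pair, as distinct from the raw read count, which is inflated by PCR amplification.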

TABLE 15 Overlap extension RT-PCR primers targeting the leader peptide region of antibody mRNA.

Conc. (nM)    Primer ID
400           CHrev-AHX89
400           CLrev-BRH06
40            hIgG-rev-OE-AHX89
40            hIgA-rev-OE-AHX89
40            hIgM-rev-OE-AHX89
40            hIgKC-rev-OE-BRH06
40            hIgLC-rev-OE-BRH06
40            VH1_L
40            VH3_L
40            VH4/6_L
40            VH5_L
40            hVλ1for_L
40            hVλ2for_L
40            hVλ3for_L
40            hVλ3for-2_L
40            hVλ3for-3_L
40            hVλ4/5for_L
40            hVλ6for_L
40            hVλ7for_L
40            hVλ8for_L
40            hVκ1/2for_L
40            hVκ3for_L
40            hVκ4for_L

After high-throughput Illumina 2×250 bp sequencing of nested PCR products, 23/23 CDR-H3 observed with ≥2 reads in both leader peptide groups displayed matching CDR-L3. This example demonstrates that various primer sets can be used to sequence multiple transcripts using the methods in the present disclosure.

Example 9—Low Dispersity, Single Cell Water-in-Oil Droplet Formation Using a Nozzle and Annular Carrier Stream

In this example, cells of the immortalized B cell line MOPC-21 were viably encapsulated in emulsion droplets of controlled size consisting of a mixture of cells in PBS and Trypan blue stain for cell viability visualization. This example demonstrates the isolation of single cells into emulsion droplets of controlled size distribution, the droplets furthermore being composed of two different aqueous streams which mix immediately prior to droplet formation (FIG. 7).

MOPC-21 cells were resuspended at a concentration of 500,000 cells/mL of PBS. A coaxial emulsification apparatus was constructed by inserting a 26-gauge needle (Hamilton Company, Reno, Nev., USA) within 19-gauge hypodermic tubing (Hamilton) and the needle was adjusted so that the needle tip was flush with the end of the hypodermic tubing. The concentric needles were placed inside ⅜ inch OD glass tubing (Wale Apparatus, Hellertown, Pa., USA) with a 140 μm orifice such that the needle exit was approximately 2 mm from the nozzle orifice. The aqueous PBS/cell solution was injected through the needle at a rate of 500 μL/min, while a PBS/0.4% Trypan blue solution (Sigma-Aldrich, St. Louis, Mo., USA) was injected through the 19-gauge hypodermic tubing, and an oil phase (molecular biology grade mineral oil with 4.5% Span-80, 0.4% Tween 80, 0.05% Triton X-100, v/v %, Sigma Aldrich Corp.) was passed through the glass tubing at a rate of 3 mL/min. Droplets suspended in oil phase were collected into a 2 mL Eppendorf tube. A syringe pump (KD Scientific Legato 200, Holliston, Mass., USA) was used to control aqueous flow rates and a gear pump (M-50, Valco Instruments, Houston, Tex., USA) was used to control oil flow rates, and the resulting emulsions were analyzed via light microscopy. Droplets with a mean diameter of approximately 85 μm were generated and encapsulated single cells displayed high viability as measured by exclusion of trypan blue (FIG. 5).
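To give a sense of scale, the volume of a spherical droplet of the stated mean diameter and the resulting mean number of cells per droplet can be estimated as in the Python sketch below. The estimate assumes spherical droplets and that the cell stream contributes half of each droplet's aqueous volume (i.e., equal flow rates for the two aqueous streams, which is not stated in this example); the function names are hypothetical.

```python
import math


def droplet_volume_nl(diameter_um):
    """Volume of a spherical droplet in nanolitres (1 nL = 1e6 cubic um)."""
    return math.pi / 6.0 * diameter_um ** 3 / 1e6


def mean_cells_per_droplet(cells_per_ml, cell_stream_fraction, diameter_um):
    """Mean cell count per droplet.

    cells_per_ml:          cell concentration in the cell stream
    cell_stream_fraction:  share of droplet volume from the cell stream
                           (assumed, e.g. 0.5 for equal aqueous flow rates)
    """
    cells_per_nl = cells_per_ml * cell_stream_fraction / 1e6  # 1 mL = 1e6 nL
    return cells_per_nl * droplet_volume_nl(diameter_um)


volume = droplet_volume_nl(85)
occupancy = mean_cells_per_droplet(500_000, 0.5, 85)
print(f"volume = {volume:.3f} nL, mean cells per droplet = {occupancy:.3f}")
```

A mean occupancy well below one cell per droplet is what keeps the fraction of droplets containing more than one cell low.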

Example 10—Sequencing Multiple Transcripts in B Cells via Encapsulation in Emulsion Droplets

In this example, cell lysis and mRNA annealing to poly(T) beads were accomplished within an emulsion generated using the method outlined in Example 9. A population of memory B cells was isolated and the cells expanded as in Example 7. Memory B cells were resuspended in PBS at a concentration of 100,000 cells/mL and passed through the innermost, 26-gauge needle of the emulsion generator device of Example 9 at a rate of 500 μL/min. 450 poly(dT) magnetic beads (1.0 μm diameter, New England Biosciences, Ipswich, Mass., USA) were pelleted with a magnet and resuspended in 5 mL of cell lysis/binding buffer (100 mM tris pH 7.5, 500 mM LiCl, 10 mM EDTA, 0.5% lithium dodecyl sulfate, 5 mM DTT), and the resulting mixture was passed through the 19-gauge hypodermic tubing at a rate of 500 μL/min, while oil phase (molecular biology grade mineral oil with 4.5% Span-80, 0.4% Tween 80, 0.05% Triton X-100, v/v %, Sigma Aldrich Corp.) was passed through the outermost glass tubing at a rate of 3 mL/min to generate an emulsion consisting of aqueous droplets of approximately 85 μm diameter containing single cells. The emulsion stream was collected into 2 mL Eppendorf tubes, and cells were lysed by detergent as droplets were generated to allow for mRNA capture onto poly(dT) magnetic beads encapsulated within the emulsion droplets.

Each 2 mL emulsion tube was maintained at room temperature for three minutes before being placed on ice for a minimum of ten minutes. Then the tubes were centrifuged at 16,000×g for 5 minutes at 4° C., and the upper mineral oil layer was removed and discarded.

200 μL of cold diethyl ether was added to chemically break the emulsion and the tubes were centrifuged at 16,000×g for 2.5 minutes to pellet magnetic beads. Magnetic beads were withdrawn using a pipette, pelleted, and resuspended in 2 mL lysis/binding buffer (100 mM tris pH 7.5, 500 mM LiCl, 10 mM EDTA, 0.5% LiDS, 5 mM DTT). Beads were then washed and resuspended in OE RT-PCR mixture as in Example 8. Leader peptide primers were used; primer concentrations are given in Table 15. The OE RT-PCR mixture bead suspension was emulsified and thermally cycled, cDNA was extracted, and a nested PCR was performed (see Example 8). Nested PCR product was electrophoresed to purify linked transcripts, which were then sequenced as in Example 8 above.

After high-throughput Illumina 2×250 bp sequencing of nested PCR products, 14,121 VH:VL pairs with ≥2 reads were recovered according to the algorithm described in Example 6 (7,367 VH:VL pairs in Group 1, and 6,754 pairs in Group 2). 3,935 CDR-H3 were observed with ≥2 reads in both groups. 3,899/3,935 of CDR-H3 observed in both groups displayed matching CDR-L3, indicating 99.5% overall accuracy according to the formula outlined in Example 7. The present example demonstrates the sequencing of multiple transcripts via mRNA capture from single cells isolated within emulsion droplets.

Example 11—Parallel Sequencing of Heavy and Light Chain cDNAs from Single Cells

Previous examples demonstrated the use of magnetic beads to capture mRNA and covalent linkage of desired cDNAs from a single cell (e.g., VH and VL cDNAs) to create a single amplicon. The single VH-VL amplicons thus generated were sequenced by high throughput DNA sequencing to reveal the repertoire of naturally paired VH and VL sequences.

In this example, the cDNAs captured onto beads were sequenced directly without linking (i.e., without creating a linked VH-VL amplicon). In this manner, the identity of the desired transcripts from a single cell was revealed without the need for overlap extension PCR. First, an equal mixture of three 5′-amine functionalized primers (Table 16) was conjugated to functionalized magnetic beads so that the immobilized oligonucleotides on each magnetic bead were in the following proportion: ⅓ poly(T) for mRNA capture, ⅓ primer specific for desired transcript 1 (e.g., the AHX89 primer of Table 1), and ⅓ primer specific for desired transcript 2 (Table 16). These primer-conjugated magnetic beads served a dual purpose: first, upon lysis, poly(T) primers captured heavy and light chain mRNA from individual cells, as in Examples 4-6; second, in the emulsion RT-PCR step, AHX89 and BRH06 primers caused heavy and light chain cDNA to amplify on the bead surface. After RT-PCR, magnetic beads were used as sequencing template for high-throughput sequencing. The process is outlined in FIG. 7.

An equal mixture of three 5′-amine oligonucleotides (Table 16) was immobilized to functionalized magnetic beads according to manufacturer protocols (Dynal MyOne Carboxylic Acid beads, 1.0 μm diameter, Invitrogen Corp.). Then, a mixture of MOPC-21 and MOPC-315 immortalized cells was washed and suspended at 100,000 cells/mL in PBS (pH 7.4). 1.2×10⁸ functionalized magnetic beads were added per mL of cell lysis/mRNA binding solution, as outlined in Example 10. The cell/bead suspension was emulsified as in Example 10, so that cells were lysed and mRNA annealed to beads. Then beads were recovered by breaking the emulsion, washed as described in Example 10, and emulsion RT-PCR was performed. RT-PCR primer concentrations are given in Table 17. Cycling conditions were as follows: 30 min at 55° C., followed by 2 min at 94° C.; four cycles of 94° C. for 30 s denature, 57° C. for 1 min anneal, 72° C. for 2 min extend; 29 cycles of 94° C. for 30 s denature, 59° C. for 30 s anneal, 72° C. for 2 min extend; then a final extension step for 7 min at 72° C.

TABLE 16 Primers conjugated to the magnetic bead surface.

Conc.    Primer ID
33%      oligodT(25)-5′amine
33%      CHrev-AHX89-5′amine
33%      CLrev-BRH06-5′amine

TABLE 17 Primers in the MOPC-21/MOPC-315 RT-PCR mix.

Conc.    Primer ID
400      CHrev-AHX89
400      CLrev-BRH06
40       MOPC21-CHrev-AHX89
40       MOPC21-CLrev-BRH06
40       MOPC315-CLrev-BRH06
40       MOPC315-CHrev-AHX89
400      MOPC21-VH-OE-5′<F3>
400      MOPC21-VL-OE-5′<F5>
400      MOPC315-VH-OE-5′<F3>
400      MOPC315-VL-OE-5′<F5>

After emulsion RT-PCR, the emulsion was broken with n-butanol according to SOLiD gene sequencing manufacturer protocols (Applied Biosystems), and magnetic beads were submitted as direct template for the Ion Torrent sequencing platform (Life Technologies). Sequencing was initiated first with the <F3> heavy chain primer to collect heavy chain cDNA sequences, followed by sequencing with the <F5> light chain primer to collect light chain cDNA sequences. The heavy and light chain sequences were matched by location on the Ion Torrent sequencing platform to obtain the native heavy and light chain pairings.

Example 12—Sequencing of Paired VH:VL Transcripts from Cells Encoding High-Affinity Antibodies

Previous examples detailed the use of various techniques for sequencing multiple transcripts from a variety of cell populations. The present example describes a method for high-throughput sequencing of natively paired VH:VL antibody sequences from only those cells encoding high affinity antibodies specific to a particular antigen of interest, using antigen-dependent poly(dT) capture and subsequent VH:VL sequencing.

Antigen-coated magnetic beads were prepared by covalently coupling free vaccine-grade tetanus toxoid (TT) (1 mM oligonucleotide, 40 nM TT, Statens Serum Institut, Copenhagen, Denmark) to carboxylic acid-functionalized magnetic beads (1 μm diameter Dynal MyOne COOH beads, Life Technologies) according to manufacturer protocols.

PBMC were collected from donated blood 14 d after administration of tetanus toxoid (TT)/diphtheria toxoid boost vaccination (TD; 20 I.E. TT and 2 I.E. diphtheria toxoid, Sanofi Pasteur MSD GmbH, Leimen, Germany) and sorted via labeled antibody staining and FACS sorting, as in Example 7. Memory B cells were seeded into sterile PDMS slides as described in Example 4 along with antigen-coated beads (approximately 40 beads/well), and cells were sealed inside the wells using a dialysis membrane and cultured inside the PDMS microwell slides for four days in memory B cell stimulation media: RPMI-1640 supplemented with 10% immunoglobulin-depleted FBS, 1× GlutaMAX, 1× non-essential amino acids, 1× sodium pyruvate and 1× penicillin/streptomycin (all from Life Technologies) along with 10 μg/mL anti-CD40 antibody (5C3, BioLegend), 500 U/mL IL-4, and 5 ng/mL IL-5 (PeproTech, Rocky Hill, N.J., USA). During this time, the cells were stimulated to secrete antibody (Taubenheim et al., 2012), and any secreted antibody specific to TT became bound to magnetic microbeads containing immobilized antigen.

A solution of 5′ streptavidin-labeled poly(dT)₂₅ oligonucleotides (Integrated DNA Technologies, USA) was mixed in an equimolar ratio with goat anti-human IgG-biotin conjugate (B1140, Sigma-Aldrich, USA). The streptavidin and biotin associated in solution to form anti-IgG antibodies with tethered poly(dT)₂₅ oligonucleotides for mRNA capture. After four days in culture, the seal was broken and the slide surface was washed gently with 400 μL PBS three times to wash away secreted antibodies without disturbing cells and beads inside wells. Excess PBS was removed and 350 μL of RPMI-1640 media containing 10 nM anti-IgG antibody/poly(dT)₂₅ conjugate was added to the microwell slide surface and the slide was incubated at room temperature for 45 minutes. Over the course of the 45 min incubation, any antigen-labeled microbeads which had been coated by anti-TT antibodies following the 4-day secretion phase (i.e., antigen-labeled microbeads co-localized in a well with a secreting cell that encoded a specific antibody for TT) became decorated with poly(dT)₂₅ for mRNA capture. Subsequently the slides were gently washed three times with 400 μL PBS to remove excess antibody/oligonucleotide conjugate, microwells were sealed with a dialysis membrane, cells were lysed, beads were recovered with a magnet, and emulsion linkage RT-PCR was performed as in Example 3, with the exception that 0.1% lithium dodecyl sulfate was used in the cell lysis buffer instead of 1% lithium dodecyl sulfate. Nested PCR was performed and linked transcripts were sequenced using a long-read Next Generation sequencing platform, as in Example 5.

The process outlined in the present method enriched the sequence set for high-affinity antigen-specific VH:VL pairs, as only the antigen-labeled beads with bound IgG immunoglobulin contained the poly(dT)₂₅ sequence required for mRNA capture after cell lysis. Thus, the method outlined in this example demonstrates the application of the high-throughput VH:VL pairing technique for sequencing of a large number of antigen-specific VH:VL pairs in a single experiment without the need for surface expression of immunoglobulin.

Example 13—RT-PCR on Single Cells Emulsified Using a Low Dispersity Droplet Emulsion

As in Example 9, an emulsion was formed by injecting an aqueous stream out of a nozzle into a fast-moving annular oil phase. Shear forces generated by the carrier stream induced aqueous droplet formation with a tightly controlled size distribution, and the nozzle/carrier stream method generated emulsions of monodisperse droplet sizes, which reduces the incidence of multiple cells per emulsion droplet caused by a range of droplet sizes. In this example, a mixture of two immortalized cell lines (MOPC-21 and MOPC-315) was used to demonstrate cell encapsulation and linkage RT-PCR directly in emulsion droplets of approximately 4 nL volume without intermediate cell lysis or mRNA capture steps.

An equal mix of RNAse-treated and washed MOPC-21 and MOPC-315 cells (as in Example 2) was resuspended at a concentration of 50,000 total cells/mL in PBS, while another aqueous phase was prepared consisting of 2× concentrated RT-PCR mixture (Quanta OneStep Fast qRT-PCR) with 0.1% BSA (Invitrogen Ultrapure BSA, 50 mg/mL), 4% SuperAse In RNAse inhibitor (Invitrogen, USA), and 0.1% NP-40 detergent. An emulsification apparatus was prepared as in Example 10. All needles and needle supply tubes were pre-blocked in 1% BSA for 30 minutes and rinsed with PBS, and cells in PBS were delivered through the inner (26 gauge) needle while RT-PCR mixture and detergent were delivered via the outer (19 gauge) needle, with both aqueous phases flowing at 500 μL/min. Oil carrier phase (molecular biology grade mineral oil with 4.5% Span-80, 0.4% Tween 80, 0.05% Triton X-100, v/v %, oil phase reagents from Sigma Aldrich Corp.) flowed through the outer glass tubing at a rate of 3 mL/min and samples were collected as in Example 10. A total of 2 mL of the cell/RT-PCR mixture mixed with 2 mL of NP-40 diluent was emulsified for approximately 100,000 cells analyzed. Primer concentrations for the RT-PCR mixture are given in Table 1, with the same thermal cycling conditions being used as those in Example 11.

The cell emulsion for RT-PCR was then placed into 96-well plates and thermally cycled, cDNA was extracted, and a nested PCR reaction was performed (see Example 4). Nested PCR primers are given in Table 2, and thermal cycling conditions for the PCR were as follows: a 2 min denaturing step at 94° C., followed by thermal cycling at 94° C. for 30 s denature, 62° C. for 30 s anneal, 72° C. for 20 s extend, for 30 cycles. Nested PCR product was electrophoresed to purify linked VH-VL cDNA, which was submitted as template for NextGen sequencing.

Therefore, the present invention is well adapted to attain the ends and advantages mentioned as well as those that are inherent therein. The particular embodiments disclosed above are illustrative only, as the present invention may be modified and practiced in different but equivalent manners apparent to those skilled in the art having the benefit of the teachings herein. Furthermore, no limitations are intended to the details of construction or design herein shown, other than as described in the claims below. It is therefore evident that the particular illustrative embodiments disclosed above may be altered or modified and all such variations are considered within the scope and spirit of the present invention. While compositions and methods are described in terms of “comprising,” “containing,” or “including” various components or steps, the compositions and methods can also “consist essentially of” or “consist of” the various components and steps. All numbers and ranges disclosed above may vary by some amount. Whenever a numerical range with a lower limit and an upper limit is disclosed, any number and any included range falling within the range is specifically disclosed. In particular, every range of values (of the form, “from about a to about b,” or, equivalently, “from approximately a to b,” or, equivalently, “from approximately a-b”) disclosed herein is to be understood to set forth every number and range encompassed within the broader range of values. Also, the terms in the claims have their plain, ordinary meaning unless otherwise explicitly and clearly defined by the patentee. Moreover, the indefinite articles “a” or “an,” as used in the claims, are defined herein to mean one or more than one of the element that it introduces. If there is any conflict in the usages of a word or term in this specification and one or more patent or other documents that may be incorporated herein by reference, the definitions that are consistent with this specification should be adopted.

All of the methods disclosed and claimed herein can be made and executed without undue experimentation in light of the present disclosure. While the compositions and methods of this invention have been described in terms of preferred embodiments, it will be apparent to those of skill in the art that variations may be applied to the methods and in the steps or in the sequence of steps of the method described herein without departing from the concept, spirit and scope of the invention. More specifically, it will be apparent that certain agents which are both chemically and physiologically related may be substituted for the agents described herein while the same or similar results would be achieved. All such similar substitutes and modifications apparent to those skilled in the art are deemed to be within the spirit, scope and concept of the invention as defined by the appended claims.

REFERENCES

The following references, to the extent that they provide exemplary procedural or other details supplementary to those set forth herein, are specifically incorporated herein by reference.

Brochet, X., Lefranc, M.-P. & Giudicelli, V. IMGT/V-QUEST: the highly customized and integrated system for IG and TR standardized V-J and V-D-J sequence analysis. Nucleic Acids Res. 36, W503-W508 (2008).

Chan, M. et al. Evaluation of Nanofluidics Technology for High-Throughput SNP Genotyping in a Clinical Setting. J Mol Diagn 13, 305-312 (2011).

Citri, A. et al. Comprehensive qPCR profiling of gene expression in single neuronal cells. Nature Protocols 7, 118-127 (2012).

DeKosky, B. J. et al. High-throughput sequencing of the paired human immunoglobulin heavy and light chain repertoire. Nat Biotech 31, 166-169 (2013).

Friguet, B., Chaffotte, A. F., Djavadi-Ohaniance, L. & Goldberg, M. E. Measurements of the true affinity constant in solution of antigen-antibody complexes by enzyme-linked immunosorbent assay. Journal of Immunological Methods 77, 305-319 (1985).

Kojima, T. et al. PCR amplification from single DNA molecules on magnetic beads in emulsion: application for high-throughput screening of transcription factor targets. Nucleic Acids Res. 33 (2005).

Krause, J. C. et al. Epitope-Specific Human Influenza Antibody Repertoires Diversify by B Cell Intraclonal Sequence Divergence and Interclonal Convergence. The Journal of Immunology 187, 3704-3711 (2011).

Kyu, S. Y. et al. Frequencies of human influenza-specific antibody secreting cells or plasmablasts post vaccination from fresh and frozen peripheral blood mononuclear cells. Journal of Immunological Methods 340, 42-47 (2009).

Mar, J. C. et al. Inferring steady state single-cell gene expression distributions from analysis of mesoscopic samples. Genome Biol 7 (2006).

Mary, P. et al. Analysis of gene expression at the single-cell level using microdroplet-based microfluidic technology. Biomicrofluidics 5 (2011).

Mazor, Y., Barnea, I., Keydar, I. & Benhar, I. Antibody internalization studied using a novel IgG binding toxin fusion. Journal of Immunological Methods 321, 41-59 (2007).

Mei, H. E. et al. Blood-borne human plasma cells in steady state are derived from mucosal immune responses. Blood 113, 2461-2469 (2009).

Meijer, P. et al. Isolation of human antibody repertoires with preservation of the natural heavy and light chain pairing. Journal of molecular biology 358, 764-772 (2006).

Novak, R. et al. Single-Cell Multiplex Gene Detection and Sequencing with Microfluidically Generated Agarose Emulsions. Angew. Chem.-Int. Edit. 50, 390-395 (2011).

Reddy, S. T. et al. Monoclonal antibodies isolated without screening by analyzing the variable-gene repertoire of plasma cells. Nature biotechnology 28, 965-U920 (2010).

Sanchez-Freire, V. et al. Microfluidic single-cell real-time PCR for comparative analysis of gene expression patterns. Nat. Protocols 7, 829-838 (2012).

Smith, K. et al. Rapid generation of fully human monoclonal antibodies specific to a vaccinating antigen. Nat. Protocols 4, 372-384 (2009).

Taubenheim, N. et al. High Rate of Antibody Secretion Is not Integral to Plasma Cell Differentiation as Revealed by XBP-1 Deficiency. The Journal of Immunology 189, 3328-3338 (2012).

Toriello, N. M. et al. Integrated microfluidic bioprocessor for single-cell gene expression analysis. Proc Natl Acad Sci USA 105, 20173-20178 (2008).

White, A. K. et al. High-throughput microfluidic single-cell RT-qPCR. Proc Natl Acad Sci USA (2011).

Wrammert, J. et al. Rapid cloning of high-affinity human monoclonal antibodies against influenza virus. Nature 453, 667-671 (2008).

Wu, X. et al. Focused Evolution of HIV-1 Neutralizing Antibodies Revealed by Structures and Deep Sequencing. Science 333, 1593-1602 (2011).

1. A method comprising: a) sequestering single cells and an mRNA capture agent into individual compartments; b) lysing the cells and collecting mRNA transcripts with the mRNA capture agent; c) isolating the mRNA from the compartments using the mRNA capture agent; d) performing reverse transcription followed by PCR amplification on the captured mRNA; and e) sequencing at least two distinct cDNA products amplified from a single cell.
2. The method of claim 1, further defined as a method for obtaining a plurality of paired antibody VH and VL sequences wherein the cells are B-cells.
3. The method of claim 1, wherein the mRNA capture agent is a bead.
4. The method of claim 3, wherein the beads are magnetic.
5. The method of claim 3, wherein the bead comprises oligonucleotides which hybridize mRNA.
6. The method of claim 5, wherein the oligonucleotides comprise at least one of poly(T) and primers specific to a transcript of interest.
7. The method of claim 2, further defined as a method for obtaining paired antibody VH and VL sequences for an antibody that binds to an antigen of interest.
8. The method of claim 3, wherein the beads are conjugated to the antigen of interest and the oligonucleotides are only conjugated to the beads in the presence of an antibody that binds to the antigen of interest.
9-13. (canceled)
14. The method of claim 1, wherein steps (a) and (b) comprise isolating single cells and an mRNA capture agent into individual microvesicles in an emulsion and in the presence of a cell lysis solution.
15-16. (canceled)
17. The method of claim 1, wherein step (e) comprises linking cDNA by performing overlap extension reverse transcriptase polymerase chain reaction to link at least 2 transcripts into a single DNA molecule.
18. The method of claim 1, wherein step (e) does not comprise the use of overlap extension reverse transcriptase polymerase chain reaction.
19. The method of claim 2, wherein step (e) comprises linking VH and VL cDNAs by performing overlap extension reverse transcriptase polymerase chain reaction to link VH and VL cDNAs in single molecules.
20. The method of claim 2, wherein step (e) does not comprise the use of overlap extension reverse transcriptase polymerase chain reaction and wherein the VH and VL cDNAs are separate molecules.
21. The method of claim 2, wherein the VH and VL sequences are obtained by sequencing of distinct molecules.
22. The method of claim 2, further comprising identifying the paired antibody VH and VL sequences comprises performing a probability analysis of the sequences.
23-27. (canceled)
28. The method of claim 1, wherein sequestering the single cells comprises introducing the cells to a device comprising a plurality of microwells so that the majority of cells are captured as single cells.
29. The method of claim 1, wherein step (e) comprises sequencing of two or more transcripts covalently linked to the same bead.
30. (canceled)
31. The method of claim 1, further comprising determining natively paired transcripts using probability analysis.
32-33. (canceled)
34. A system comprising: a) an aqueous fluid phase exit disposed within an annular flowing oil phase; and b) an aqueous fluid phase, wherein the aqueous phase fluid comprises a suspension of cells and is dispersed within the flowing oil phase, resulting in emulsified droplets with low size dispersity comprising an aqueous suspension of cells.
35-43. (canceled)
44. A composition comprising an emulsion having a plurality of individual microvesicles, said microvesicles comprising a bead with immobilized oligonucleotides for priming of reverse transcription and individual B-cells, which have been disrupted to release mRNA transcripts from individual B-cells.
45-64. (canceled)